METHYLATED CpG ISLAND AMPLIFICATION (MCA)

STATEMENT AS TO FEDERALLY SPONSORED RESEARCH 

This invention was made with Government support under Grant Nos. CA43318 and
CA54396, awarded by the National Cancer Institute, and Grant No. CA43318, a Colon Cancer
SPORE Grant. The Government may have certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates generally to regulation of gene expression and more 
specifically to a method of determining the DNA methylation status of CpG sites in a given
locus. 

BACKGROUND OF THE INVENTION 

DNA methylases transfer methyl groups from the universal methyl donor S-adenosyl 
methionine to specific sites on the DNA. Several biological functions have been attributed to 
the methylated bases in DNA. The most established biological function for methylated DNA
is the protection of DNA from digestion by cognate restriction enzymes. The restriction
modification phenomenon has, so far, been observed only in bacteria. Mammalian cells,
however, possess a different methylase that exclusively methylates cytosine residues that are
5' neighbors of guanine (CpG). This modification of cytosine residues has important
regulatory effects on gene expression, especially when involving CpG rich areas, known as
CpG islands, located in the promoter regions of many genes.

Methylation has been shown by several lines of evidence to play a role in gene
activity, cell differentiation, tumorigenesis, X-chromosome inactivation, genomic imprinting
and other major biological processes (Razin, A., H., and Riggs, R.D. eds. in DNA
Methylation Biochemistry and Biological Significance. Springer-Verlag, New York, 1984).
In eukaryotic cells, methylation of cytosine residues that are immediately 5' to
a guanosine occurs predominantly in CG poor regions (Bird, A., Nature,
321:209, 1986). In contrast, CpG islands remain unmethylated in normal cells,
except during X-chromosome inactivation (Migeon, et al., supra) and parental
specific imprinting (Li, et al., Nature, 366:362, 1993), where methylation of 5'
regulatory regions can lead to transcriptional repression. De novo methylation
of the Rb gene has been demonstrated in a small fraction of retinoblastomas
(Sakai, et al., Am. J. Hum. Genet., 48:880, 1991), and recently, a more detailed
analysis of the VHL gene showed aberrant methylation in a subset of sporadic
renal cell carcinomas (Herman, et al., Proc. Natl. Acad. Sci., U.S.A., 91:9700,
1994). Expression of a tumor suppressor gene can also be abolished by de
novo DNA methylation of a normally unmethylated CpG island (Issa, et al.,
Nature Genet., 7:536, 1994; Herman, et al., supra; Merlo, et al., Nature Med.,
1:686, 1995; Herman, et al., Cancer Res., 56:722, 1996; Graff, et al., Cancer
Res., 55:5195, 1995; Herman, et al., Cancer Res., 55:4525, 1995).

Human cancer cells typically contain somatically altered nucleic acid,
characterized by mutation, amplification, or deletion of critical genes. In
addition, the nucleic acid from human cancer cells often displays somatic
changes in DNA methylation (E.R. Fearon, et al., Cell, 61:759, 1990;
P.A. Jones, et al., Cancer Res., 46:461, 1986; R. Holliday, Science, 238:163,
1987; A. De Bustros, et al., Proc. Natl. Acad. Sci., USA, 85:5693, 1988;
P.A. Jones, et al., Adv. Cancer Res., 54:1, 1990; S.B. Baylin, et al., Cancer
Cells, 3:383, 1991; M. Makos, et al., Proc. Natl. Acad. Sci., USA, 89:1929,
1992; N. Ohtani-Fujita, et al., Oncogene, 8:1063, 1993). However, the precise
role of abnormal DNA methylation in human tumorigenesis has not been
established. Aberrant methylation of normally unmethylated CpG islands has
been described as a frequent event in immortalized and transformed cells, and
has been associated with transcriptional inactivation of defined tumor
suppressor genes in human cancers. In the development of colorectal cancers
(CRC), a series of tumor suppressor genes (TSG) such as APC, p53, DCC and
DPC4 are inactivated by mutations and chromosomal deletions (reviewed in
Kinzler and Vogelstein 1996). Some of these alterations result from a
chromosomal instability phenotype described in a subset of CRC (Lengauer et
al., 1997a). Recently, an additional pathway has been shown to be involved in
a familial form of CRC, hereditary non-polyposis colorectal cancer. The
cancers from these patients show a characteristic mutator phenotype which
causes microsatellite instability (MI), and mutations at other gene loci such as
TGF-β-RII (Markowitz et al., 1995) and BAX (Rampino et al., 1997). This
phenotype usually results from mutations in the mismatch repair (MMR)
genes hMSH2 and hMLH1 (reviewed by Peltomaki and de la Chapelle, 1997).
A subset of sporadic CRC also show MI, but mutations in MMR genes appear
to be less frequent in these tumors (Liu et al., 1995; Moslein et al., 1996).

Another molecular defect described in CRC is CpG island (CGI)
methylation. CGIs are short sequences rich in the CpG dinucleotide and can
be found in the 5' region of about half of all human genes (Bird, 1986).
Methylation of cytosine within 5' CGIs is associated with loss of gene
expression and has been seen in physiological conditions such as X
chromosome inactivation and genomic imprinting (reviewed in Latham,
1996). Aberrant methylation of CGIs has been detected in genetic diseases
such as the fragile-X syndrome (Hansen et al., 1992), in aging cells (Issa et
al., 1994) and in neoplasia. About half of the tumor suppressor genes which
have been shown to be mutated in the germline of patients with familial cancer
syndromes have also been shown to be aberrantly methylated in some
proportion of sporadic cancers, including Rb, VHL, p16, hMLH1, and BRCA1
(reviewed in Baylin et al., 1998; Jones 1997). TSG methylation in cancer is
usually associated with (1) lack of gene transcription and (2) absence of
coding region mutation. Thus it has been proposed that CGI methylation
serves as an alternative mechanism of gene inactivation in cancer.

The causes and global patterns of CGI methylation in human cancers
remain poorly defined. Aging could be a factor in this process because
methylation of several CGIs could be detected in an age-related manner in
normal colon mucosa as well as in CRC (Issa et al., 1994). In addition,
aberrant methylation of CGIs has been associated with the MI phenotype in
CRC (Ahuja et al., 1997) as well as specific carcinogen exposures (Issa et al.,
1996). However, an understanding of aberrant methylation in CRC has been
somewhat limited by the small number of CGIs analyzed to date. In fact,
previous studies have suggested that large numbers of CGIs are methylated in
immortalized cell lines (Antequera et al., 1990), and it is not well understood
whether this global aberrant methylation is caused by the cell culture
conditions or whether it is an integral part of the pathogenesis of cancer.

Most of the methods developed to date for detection of methylated
cytosine depend upon cleavage of the phosphodiester bond alongside cytosine
residues, using either methylation-sensitive restriction enzymes or reactive
chemicals such as hydrazine which differentiate between cytosine and its
5-methyl derivative. Genomic sequencing protocols which identify a 5-MeC
residue in genomic DNA as a site that is not cleaved by any of the
Maxam-Gilbert sequencing reactions have also been used, but still suffer
disadvantages such as the requirement for large amounts of genomic DNA and
the difficulty in detecting a gap in a sequencing ladder which may contain
bands of varying intensity.

Mapping of methylated regions in DNA has relied primarily on
Southern hybridization approaches, based on the inability of
methylation-sensitive restriction enzymes to cleave sequences which contain
one or more methylated CpG sites. This method provides an assessment of the
overall methylation status of CpG islands, including some quantitative
analysis, but is relatively insensitive and requires large amounts of high
molecular weight DNA.

Another method utilizes bisulfite treatment of DNA to convert all
unmethylated cytosines to uracil. The altered DNA is amplified and
sequenced to show the methylation status of all CpG sites. However, this
method is technically difficult and labor intensive and, without cloning of the
amplified products, it is less sensitive than Southern analysis, requiring
approximately 10% of the alleles to be methylated for detection.
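
The bisulfite conversion described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method; the function names and the 12-base fragment are hypothetical, and the sketch assumes complete conversion and models only the top strand as it would be read after amplification (uracil read as T).

```python
def cpg_positions(seq):
    """Return the 0-based positions of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def bisulfite_convert(seq, methylated_positions):
    """Return the top strand as read after bisulfite treatment and PCR.

    Unmethylated cytosines are converted (C -> U, read as T); cytosines whose
    0-based position is in methylated_positions are left unchanged.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical fragment containing one CGCG (BstUI) site; illustration only.
FRAGMENT = "ATCGCGTACCGA"
ALL_CPGS = set(cpg_positions(FRAGMENT))

METHYLATED_READ = bisulfite_convert(FRAGMENT, ALL_CPGS)
UNMETHYLATED_READ = bisulfite_convert(FRAGMENT, set())
```

In this sketch the CGCG site survives conversion only when its CpGs are methylated, which is why a restriction digest of the amplified product can report methylation status.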

Identification of the earliest genetic changes in tumorigenesis is a
major focus in molecular cancer research. Diagnostic approaches based on
identification of these changes are likely to allow implementation of early
detection strategies, and novel therapeutic approaches targeting these early
changes might lead to more effective cancer treatment.



WO 00/26401

PCT/US99/25251

SUMMARY OF THE INVENTION

The invention provides a method for detecting a methylated CpG- 
containing nucleic acid. This method can be used to identify sequences which 
are differentially methylated during a disease process such as a cell 
proliferative disorder. 

In one embodiment, a method is provided for identifying a methylated 
CpG-containing nucleic acid. The method includes contacting a nucleic acid 
sample suspected of containing a CpG-containing nucleic acid, with a 
methylation sensitive restriction endonuclease that cleaves only unmethylated 
CpG sites, under conditions and for a time to allow cleavage of unmethylated 
nucleic acid; and contacting the sample with an isoschizomer of the 
methylation sensitive restriction endonuclease, wherein the isoschizomer of 
the methylation sensitive restriction endonuclease cleaves both methylated and 
unmethylated CpG sites. Oligonucleotides are added to the nucleic acid 
sample under conditions and for a time to allow ligation of the 
oligonucleotides to nucleic acid cleaved by the restriction endonuclease and 
the digested nucleic acid is amplified for further analysis. 

In another embodiment, a method is provided for detecting an age- 
associated disorder associated with methylation of CpG islands in a nucleic 
acid sequence of interest in a subject having or at risk of having said disorder. 
The method includes contacting a nucleic acid sample suspected of comprising 
a CpG-containing nucleic acid with a methylation sensitive restriction 
endonuclease that cleaves only unmethylated CpG sites under conditions and 
for a time to allow cleavage of unmethylated nucleic acid, and contacting the 
sample with an isoschizomer of the methylation sensitive restriction 
endonuclease, wherein the isoschizomer of the methylation sensitive
restriction endonuclease cleaves both methylated and unmethylated CpG sites.
Oligonucleotides are added to the nucleic acid sample under conditions and
for a time to allow ligation of the oligonucleotides to nucleic acid cleaved by
the restriction endonuclease, and the digested nucleic acid is amplified. The
amplified, digested nucleic acid is contacted with a membrane and the
membrane is hybridized with a probe of interest.

In yet another embodiment, a method is provided for evaluating the
response of a cell to an agent. The method includes contacting a nucleic acid
sample suspected of containing a CpG-containing nucleic acid with a
methylation sensitive restriction endonuclease that cleaves only unmethylated
CpG sites, under conditions and for a time to allow cleavage of unmethylated
nucleic acid, and contacting the sample with an isoschizomer of the
methylation sensitive restriction endonuclease, wherein the isoschizomer of
the methylation sensitive restriction endonuclease cleaves both methylated and
unmethylated CpG sites. Oligonucleotides are added to the nucleic acid
sample under conditions and for a time to allow ligation of the
oligonucleotides to nucleic acid cleaved by the restriction endonuclease, and
the digested nucleic acid is amplified. The amplified, digested nucleic acid is
adhered to a membrane and the membrane is hybridized with a probe of
interest.

In a further embodiment, a kit for the detection of a methylated
CpG-containing nucleic acid is provided. In one embodiment the kit includes a
carrier means containing one or more containers including a container
containing an oligonucleotide for ligation of the oligonucleotides to nucleic
acid, a second container containing a methylation sensitive restriction
endonuclease and a third container containing an isoschizomer of the
methylation sensitive endonuclease. In another embodiment the kit includes a
carrier means containing one or more containers containing a membrane,
wherein the membrane has a member of the group consisting of SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID
NO:33 (MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32, and MINT33) immobilized
on the membrane.

In a further embodiment, an isolated nucleic acid including a member
selected from SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:14, SEQ ID
NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID
NO:31, SEQ ID NO:32, and SEQ ID NO:33 (MINT1, MINT2, MINT4,
MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19,
MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32, and MINT33) is provided. An isolated methylated nucleic acid
sequence having a sequence as set forth in a member of the group consisting
of SEQ ID NOs:1-33 (MINT1-33) is also provided.



BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of MCA. A hypothetical fragment of
genomic DNA is represented by a solid line, with 7 SmaI sites depicted by tick
marks. Methylated SmaI sites are indicated by an m. Fragments B and D are
CpG islands. B is methylated in both normal (right) and cancer (left), while D
is differentially methylated in cancer. For MCA, unmethylated SmaI sites are
eliminated by digestion with SmaI (which is methylation-sensitive and does
not cleave when its recognition sequence CCCGGG contains a methylated
CpG), which leaves the fragment blunt ended. Methylated SmaI sites are then
digested with the non-methylation sensitive SmaI isoschizomer XmaI, which
digests methylated CCCGGG sites, leaving a CCGG overhang (sticky ends).
Adaptors are ligated to these sticky ends, and PCR is performed to amplify the
methylated sequences. The MCA amplicons can be used directly in a dot blot
analysis to study the methylation status of any gene for which a probe is
available (left). Alternatively, MCA products can be used to clone
differentially methylated sequences by RDA (right).

FIG. 2 shows the nucleotide sequence of a differentially methylated
clone, MINT2, obtained by MCA followed by RDA. The restriction
endonuclease sites for SmaI are underlined. Primer sequences used for
bisulfite-PCR are also underlined. The restriction endonuclease site for BstUI
used to detect methylation after bisulfite PCR is shown by a gray box.

FIG. 3 shows a map of the versican gene first exon (filled box) and
flanking regions. The position of MINT11 is shown by a solid line (on top).
CpG sites are indicated below. Locations of the primers used for bisulfite-PCR
are shown by arrows.
FIG. 4 is a pictorial representation of global hypermethylation in CRC.
Each column represents a separate gene locus. Each row is a primary
colorectal cancer (samples above the bold solid line) or polyp (below the bold
solid line). Black squares: methylation > 10%. Gray squares: 1-10%
methylation. White squares: < 1% methylation. A: GH+MI+, B: GH+MI-, C:
GH-MI+, D: GH-MI-, E: GH+, F: GH-. A-D are cancers. E and F are
adenomas. MI denotes the presence of microsatellite instability. ND, not done.

FIG. 5 shows a model integrating CGI methylation in colorectal
carcinogenesis.

FIGS. 6A-H are the nucleic acid sequences of MINT1-33 (SEQ ID 
NO: 1-33). 

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides a method, called methylated CpG island
amplification (MCA), for identifying a methylated CpG-containing nucleic
acid. MCA can be used to study methylation in normal and neoplastic
cells, and allows rapid screening of nucleic acid samples for the presence of
hypermethylation of specific genes. MCA can also be used to clone genes and
nucleic acid sequences differentially methylated in normal and abnormal
tissues and cells.

It should be noted that as used herein and in the appended claims, the
singular forms "a," "an," and "the" include plural referents unless the context
clearly dictates otherwise. Thus, for example, reference to "a cell" includes a
plurality of such cells and reference to "the restriction enzyme" includes
reference to one or more restriction enzymes and equivalents thereof known to
those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein 
have the same meaning as commonly understood by one of ordinary skill in the
art to which this invention belongs. Although any methods, devices and 
materials similar or equivalent to those described herein can be used in the 
practice or testing of the invention, the preferred methods, devices and 
materials are now described. 

All publications mentioned herein are incorporated herein by reference 
in full for the purpose of describing and disclosing the methodologies which 
are described in the publications which might be used in connection with the 
presently described invention. The publications discussed above and 
throughout the text are provided solely for their disclosure prior to the filing 
date of the present application. Nothing herein is to be construed as an 
admission that the inventors are not entitled to antedate such disclosure by 
virtue of prior invention. 

Any nucleic acid sample, in purified or nonpurified form, can be 
utilized as the starting nucleic acid or acids, provided it contains, or is 
suspected of containing, a nucleic acid sequence containing the target locus 
(e.g., CpG-containing nucleic acid). In general the CpG-containing nucleic 
acid will be DNA. However, the process may employ, for example, samples 
that contain DNA, or DNA and RNA, including messenger RNA, wherein 
DNA or RNA may be single stranded or double stranded, or a DNA-RNA 
hybrid may be included in the sample. A mixture of nucleic acids may also be 
employed. The specific nucleic acid sequence to be detected may be a fraction
of a larger molecule or can be present initially as a discrete molecule, so that
the specific sequence constitutes the entire nucleic acid. It is not necessary 
that the sequence to be studied be present initially in a pure form; the nucleic 
acid may be a minor fraction of a complex mixture, such as contained in 
whole human DNA. The nucleic acid may be contained in a biological
sample. Such samples include but are not limited to a serum, urine, saliva,
cerebrospinal fluid, pleural fluid, ascites fluid, sputum, stool, or biopsy
sample. The nucleic acid-containing sample used for detection of methylated
CpG may be from any source including, but not limited to, brain, colon,
urogenital, hematopoietic, thymus, testis, ovarian, uterine, prostate, breast,
lung and renal tissue and may be extracted by a variety of techniques
such as that described by Maniatis, et al. (Molecular Cloning: a Laboratory
Manual, Cold Spring Harbor, NY, pp 280, 281, 1982).

The nucleic acid of interest can be any nucleic acid where it is
desirable to detect the presence of a CpG island. In one embodiment, the CpG
island comprises a CpG island located in a gene. A "CpG island" is a CpG
rich region of a nucleic acid sequence. The nucleic acid sequence may be, for
example, a p16, a Rb, a VHL, a hMLH1, or a BRCA1 gene. Alternatively the
nucleic acid of interest can be, for example, a MINT1-33 nucleic acid
sequence. However, any gene or nucleic acid sequence of interest containing
a CpG sequence can be detected using the method of the invention.

The presence of methylated CpG in the nucleic acid-containing
specimen may be indicative of a disorder. In one embodiment, the disorder is
a cell proliferative disorder. A "cell proliferative disorder" is any disorder in
which the proliferative capabilities of the affected cells are different from the
normal proliferative capabilities of unaffected cells. An example of a cell
proliferative disorder is neoplasia. Malignant cells (i.e., cancer) develop as a
result of a multistep process. Specific, non-limiting examples of disorders
associated with increased methylation of CpG-islands are colon cancer, lung
cancer, renal cancer, leukemia, breast cancer, prostate cancer, uterine cancer,
astrocytoma, glioblastoma, and neuroblastoma.

In another embodiment, the disorder is an age-associated disorder. The
term "age-associated disorder" is used to describe a disorder observed with the
biological progression of events occurring over time in a subject. Preferably,
the subject is a human. Non-limiting examples of age-associated disorders
include atherosclerosis, diabetes mellitus, and dementia.
An age-associated disorder may also be a cell proliferative disorder.
Examples of age-associated disorders which are cell proliferative disorders
include colon cancer, lung cancer, breast cancer, prostate cancer, and
melanoma, amongst others. An age-associated disorder is further intended to
mean the biological progression of events that occur during a disease process
that affects the body, which mimic or substantially mimic all or part of the
aging events which occur in a normal subject, but which occur in the diseased
state over a shorter period of time.

In one embodiment, the age-associated disorder is a "memory
disorder or learning disorder," which is characterized by a statistically
significant decrease in memory or learning assessed over time by the Randt
Memory Test (Randt et al., Clin. Neuropsychol., 2:184, 1980), Wechsler
Memory Scale (J. Psych., 15:87-95, 1945), Forward Digit Span test (Craik,
Age Differences in Human Memory, in: Handbook of the Psychology of
Aging, Birren, J., and Schaie, K., Eds., New York, Van Nostrand, 1977),
Mini-Mental State Exam (Folstein et al., J. of Psych. Res. 12:189-192, 1975),
or California Verbal Learning Test (CVLT) wherein such
non-neurodegenerative pathological factors as aging, anxiety, fatigue, anger,
depression, confusion, or vigor are controlled for. (See, U.S. Patent No.
5,063,206 for example).

If the sample is impure (e.g., plasma, serum, stool, ejaculate, sputum, 
saliva, cerebrospinal fluid or blood or a sample embedded in paraffin), it may 
be treated before amplification with a reagent effective for opening the cells, 
fluids, tissues, or animal cell membranes of the sample, and for exposing the 
nucleic acid(s). Methods for purifying or partially purifying nucleic acid from 
a sample are well known in the art (e.g., Sambrook et al., Molecular Cloning:
a Laboratory Manual. Cold Spring Harbor Press, 1989, herein incorporated by
reference).

In one embodiment, a method is provided for identifying a methylated 
CpG-containing nucleic acid, including contacting a nucleic acid sample 
suspected of comprising a CpG-containing nucleic acid with a methylation 
sensitive restriction endonuclease that cleaves only unmethylated CpG sites 
under conditions and for a time to allow cleavage of unmethylated nucleic 
acid. The sample is further contacted with an isoschizomer of the methylation 
sensitive restriction endonuclease, that cleaves both methylated and 
unmethylated CpG-sites, under conditions and for a time to allow cleavage of 
methylated nucleic acid. Oligonucleotides are added to the nucleic acid 
sample under conditions and for a time to allow ligation of the 
oligonucleotides to nucleic acid cleaved by the restriction endonuclease, and 
the digested nucleic acid is amplified. Following identification, the
methylated CpG-containing nucleic acid can be cloned, using methods well
known to one of skill in the art (see Sambrook et al., Molecular Cloning: a
Laboratory Manual. Cold Spring Harbor Press, 1989).
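
The sequence of steps in this embodiment can be sketched in silico as follows. This Python sketch is illustrative only and is not the disclosed procedure: it uses the SmaI/XmaI pair and the CCCGGG recognition sequence from FIG. 1, the function names and test sequence are hypothetical, and it assumes that a fragment is amplified only when both of its ends carry an adaptor, that is, when it is flanked by two methylated sites.

```python
SITE = "CCCGGG"  # recognition sequence shared by SmaI and XmaI


def find_sites(seq):
    """Return the 0-based start position of every CCCGGG site."""
    seq = seq.upper()
    hits = []
    start = seq.find(SITE)
    while start != -1:
        hits.append(start)
        start = seq.find(SITE, start + 1)
    return hits


def mca_amplifiable_fragments(seq, methylated_sites):
    """Return (left_site, right_site) pairs expected to be amplified.

    methylated_sites is the set of site start positions carrying a
    methylated CpG.
    """
    sites = find_sites(seq)
    # Step 1: the methylation sensitive enzyme cuts unmethylated sites (blunt).
    # Step 2: the isoschizomer cuts the remaining, methylated sites (sticky).
    end_type = {s: "sticky" if s in methylated_sites else "blunt" for s in sites}
    # Step 3: adaptors ligate only to sticky ends (assumption: both ends needed).
    return [
        (left, right)
        for left, right in zip(sites, sites[1:])
        if end_type[left] == "sticky" and end_type[right] == "sticky"
    ]


# Hypothetical sequence with three sites; the first two are methylated.
EXAMPLE = "AT" + SITE + "ACGT" * 3 + SITE + "TTAA" + SITE + "GC"
AMPLIFIED = mca_amplifiable_fragments(EXAMPLE, {2, 20})
```

Here only the fragment between the two methylated sites is reported, mirroring the enrichment for methylated sequences described above.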

A "methylation sensitive restriction endonuclease" is a restriction
endonuclease that includes CG as part of its recognition site and has altered
activity when the C is methylated as compared to when the C is not
methylated. Preferably, the methylation sensitive restriction endonuclease has
inhibited activity when the C is methylated (e.g., SmaI). Specific non-limiting
examples of methylation sensitive restriction endonucleases include SmaI,
BssHII, or HpaII. Such enzymes can be used alone or in combination. Other
methylation sensitive restriction endonucleases will be known to those of skill
in the art and include, but are not limited to SacII, EagI, and BstUI, for
example. An "isoschizomer" of a methylation sensitive restriction
endonuclease is a restriction endonuclease which recognizes the same
recognition site as a methylation sensitive restriction endonuclease but which
cleaves both methylated and unmethylated CGs. One of skill in the art can
readily determine appropriate conditions for a restriction endonuclease to
cleave a nucleic acid (see Sambrook et al., Molecular Cloning: a Laboratory
Manual. Cold Spring Harbor Press, 1989). Without being bound by theory,
actively transcribed genes generally contain fewer methylated CGs than
other genes.
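
The two definitions above can be captured in a small lookup table. The Python sketch below is an illustration under stated assumptions, not a reference for enzyme behavior: SmaI, XmaI and HpaII are taken from the text, whereas MspI is added from general knowledge as the usual methylation-insensitive partner of HpaII, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str                  # recognition sequence, 5' to 3'
    blocked_by_cpg_methylation: bool


# MspI is an assumption added for illustration; it is not named in the text.
ENZYMES = [
    Enzyme("SmaI", "CCCGGG", True),
    Enzyme("XmaI", "CCCGGG", False),
    Enzyme("HpaII", "CCGG", True),
    Enzyme("MspI", "CCGG", False),
]
BY_NAME = {e.name: e for e in ENZYMES}


def isoschizomers_of(name):
    """Names of the other enzymes recognizing the same site as `name`."""
    site = BY_NAME[name].site
    return [e.name for e in ENZYMES if e.site == site and e.name != name]


def cleaves(name, cpg_methylated):
    """True if the enzyme is expected to cut a site in the given state."""
    enzyme = BY_NAME[name]
    return not (cpg_methylated and enzyme.blocked_by_cpg_methylation)
```

For example, `cleaves("SmaI", True)` is false while `cleaves("XmaI", True)` is true, which is the property the method relies on.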

In the method of the invention, a nucleic acid of interest is cleaved
with a methylation sensitive endonuclease. In one embodiment, cleavage with
the methylation sensitive endonuclease creates a sufficient overhang on the
nucleic acid of interest. Following cleavage with the isoschizomer, the
cleavage product can still have a sufficient overhang. An "overhang" refers to
nucleic acid having two strands wherein the strands end in such a manner that
a few bases of one strand are not base paired to the other strand. A "sufficient
overhang" refers to an overhang of sufficient length to allow specific
hybridization of an oligonucleotide of interest. In one embodiment, a
sufficient overhang is at least two bases in length. In another embodiment, the
sufficient overhang is four or more bases in length. An overhang of a specific
sequence on the nucleic acid of interest may be desired in order for an
oligonucleotide of interest to hybridize. In this case, the isoschizomer can be
used to create the overhang having the desired sequence on the nucleic acid of
interest.



In another embodiment, the cleavage with a methylation sensitive
endonuclease results in a reaction product of the nucleic acid of interest that
has a blunt end or an insufficient overhang. In this embodiment, an
isoschizomer of the methylation sensitive restriction endonuclease can create a
sufficient overhang on the nucleic acid of interest. "Blunt ends" refers to a
flush ending of two strands, the sense strand and the antisense strand, of a
nucleic acid.
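
The distinction between blunt ends and sufficient overhangs can be made concrete with a short Python sketch. It assumes a palindromic recognition site cut symmetrically on both strands; the cut offsets for SmaI (blunt) and XmaI (CCGG overhang) follow FIG. 1, the HpaII offset is an assumption from general knowledge, and the function names are hypothetical.

```python
def overhang(site, cut_offset):
    """Classify the end left by a symmetric cut in a palindromic site.

    cut_offset is the number of bases 5' of the cut on the top strand;
    by symmetry the bottom strand is cut at len(site) - cut_offset.
    Returns (kind, single_stranded_sequence).
    """
    n = len(site)
    lo, hi = sorted((cut_offset, n - cut_offset))
    tail = site[lo:hi]
    if not tail:
        return ("blunt", "")
    kind = "5' overhang" if cut_offset < n - cut_offset else "3' overhang"
    return (kind, tail)


def is_sufficient(tail, min_len=2):
    """Apply the 'at least two bases' embodiment by default."""
    return len(tail) >= min_len


SMAI_END = overhang("CCCGGG", 3)   # CCC^GGG
XMAI_END = overhang("CCCGGG", 1)   # C^CCGGG
HPAII_END = overhang("CCGG", 1)    # C^CGG (assumed cut position)
```

Passing `min_len=4` to `is_sufficient` models the embodiment in which the overhang is four or more bases in length; under that setting the XmaI end qualifies and the two-base end does not.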

Once a sufficient overhang is created on the nucleic acid of interest, an
oligonucleotide is ligated to the nucleic acid of interest which has
been cleaved by the methylation specific restriction endonuclease. "Ligation"
is the attachment of two nucleic acid sequences by base pairing of
substantially complementary sequences or by the formation of covalent
bonds between two nucleic acid sequences. An "oligonucleotide" is a nucleic
acid sequence of 2 to 40 bases in length. Preferably the oligonucleotide is
from 15 to 35 bases in length. In one embodiment, the oligonucleotide is
ligated to the overhang on the nucleic acid sequence of interest by base
pairing.
In one embodiment, two oligonucleotides are utilized to form an
adaptor. An "adaptor" is a double-stranded nucleic acid sequence that has a
sufficient single-stranded overhang at one or both ends such that
the adaptor can be ligated by base-pairing to a sufficient overhang on a nucleic
acid of interest that has been cleaved by a methylation sensitive restriction
enzyme or an isoschizomer of a methylation sensitive restriction enzyme. In
one embodiment, two oligonucleotides can be used to form an adaptor; these
oligonucleotides are substantially complementary over their entire sequence
except for the region(s) at the 5' and/or 3' ends that will form a single stranded
overhang. The single stranded overhang is complementary to an overhang on
the nucleic acid cleaved by a methylation sensitive restriction enzyme or an
isoschizomer of a methylation sensitive restriction enzyme, such that the
overhang on the nucleic acid of interest will base pair with the 3' or 5' single
stranded end of the adaptor under appropriate conditions. The conditions will
vary depending on the sequence composition (GC vs AT), the length, and the
type of nucleic acid (see Sambrook et al., Molecular Cloning: a Laboratory
Manual. 2nd Ed.; Cold Spring Harbor Laboratory Press, Plainview, NY,
1989).
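
As an illustration of how two oligonucleotides can form such an adaptor, the Python sketch below derives a short strand from a long strand so that annealing leaves a single-stranded CCGG tail, and then checks that the tail can base pair with the CCGG overhang described for FIG. 1. The 24-base sequence, the 8-base duplex length and all function names are hypothetical and chosen only for the example; they are not the adaptor sequences of the invention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_short_oligo(long_oligo, tail, duplex_len):
    """Short strand = desired 5' tail + reverse complement of the long strand's 3' end."""
    return tail + revcomp(long_oligo[-duplex_len:])


def adaptor_tail(long_oligo, short_oligo, duplex_len):
    """Return the unpaired 5' tail of the short strand after annealing."""
    paired = revcomp(long_oligo[-duplex_len:])
    if not short_oligo.upper().endswith(paired):
        raise ValueError("strands are not complementary over the duplex region")
    return short_oligo.upper()[: len(short_oligo) - duplex_len]


def can_base_pair(tail_a, tail_b):
    """Two single-stranded tails anneal when one is the reverse complement of the other."""
    return tail_a.upper() == revcomp(tail_b)


LONG_OLIGO = "GTACTGACCTAGTCAGGCTAACGT"  # hypothetical 24-mer, illustration only
SHORT_OLIGO = design_short_oligo(LONG_OLIGO, "CCGG", duplex_len=8)
TAIL = adaptor_tail(LONG_OLIGO, SHORT_OLIGO, duplex_len=8)
```

Because CCGG is its own reverse complement, the same adaptor tail pairs with the overhang at either end of a fragment cut at two such sites.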

Following the ligation of the oligonucleotide, the nucleic acid of
interest is amplified using a primer complementary to the oligonucleotide.
Specifically, the term "primer" as used herein refers to a sequence comprising
two or more deoxyribonucleotides or ribonucleotides, preferably more than
three, and most preferably more than 8, which sequence is capable of initiating
synthesis of a primer extension product, which is substantially complementary
to a nucleic acid such as an adaptor or a ligated oligonucleotide.
Environmental conditions conducive to synthesis include the presence of
nucleoside triphosphates and an agent for polymerization, such as DNA
polymerase, and a suitable temperature and pH. The primer is preferably
single stranded for maximum efficiency in amplification, but may be double
stranded. If double stranded, the primer is first treated to separate its strands
before being used to prepare extension products. In one embodiment, the
primer is an oligodeoxyribonucleotide. The primer must be sufficiently long
to prime the synthesis of extension products in the presence of the inducing
agent for polymerization. The exact length of primer will depend on many
factors, including temperature, buffer, and nucleotide composition. The
oligonucleotide primer typically contains 12-20 or more nucleotides, although
it may contain fewer nucleotides.

Primers of the invention are designed to be "substantially" 
complementary to each strand of the oligonucleotide to be amplified and
include the appropriate G or C nucleotides as discussed above. This means
that the primers must be sufficiently complementary to hybridize with their
respective strands under conditions which allow the agent for polymerization
to perform. In other words, the primers should have sufficient
complementarity with a 5' and 3' oligonucleotide to hybridize therewith and
permit amplification of CpG-containing nucleic acid sequence.

Primers of the invention are employed in the amplification process 
which is an enzymatic chain reaction that produces exponential quantities of 
target locus relative to the number of reaction steps involved (e.g., polymerase
chain reaction or PCR). Typically, one primer is complementary to the
negative (-) strand of the locus and the other is complementary to the positive
(+) strand. Annealing the primers to denatured nucleic acid followed by
extension with an enzyme, such as the large fragment of DNA Polymerase I
(Klenow) and nucleotides, results in newly synthesized + and - strands
containing the target locus sequence. Because these newly synthesized
sequences are also templates, repeated cycles of denaturing, primer annealing,
and extension result in exponential production of the region (i.e., the target
locus sequence) defined by the primer. The product of the chain reaction is a
discrete nucleic acid duplex with termini corresponding to the ends of the
specific primers employed.
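
The exponential production described above can be sketched numerically. The following Python function is an idealized model only: the per-cycle efficiency parameter is an assumption introduced for illustration, and real reactions plateau as reagents are consumed.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of amplification cycles.

    efficiency is the fraction of templates copied in each cycle
    (1.0 means every template is duplicated, i.e. perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


print(copies_after_cycles(1, 10))       # 1024.0 with perfect doubling
print(copies_after_cycles(1, 10, 0.9))  # fewer copies at 90% efficiency
```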

The oligonucleotide primers of the invention may be prepared using 
any suitable method, such as conventional phosphotriester and phosphodiester
methods or automated embodiments thereof. In one such automated
embodiment, diethylphosphoramidites are used as starting materials and may
be synthesized as described by Beaucage, et al. (Tetrahedron Letters,
22:1859-1862, 1981). One method for synthesizing oligonucleotides on a
modified solid support is described in U.S. Patent No. 4,458,066.

Where the CpG-containing nucleic acid sequence of interest contains 
two strands, it is necessary to separate the strands of the nucleic acid before it 
can be used as a template for the amplification process. Strand separation can
be effected either as a separate step or simultaneously with the synthesis of the
primer extension products. This strand separation can be accomplished using
various suitable denaturing conditions, including physical, chemical, or
enzymatic means; the word "denaturing" includes all such means. One
physical method of separating nucleic acid strands involves heating the nucleic
acid until it is denatured. Typical heat denaturation may involve temperatures
ranging from about 80° to 105°C for times ranging from about 1 to 10 minutes.
Strand separation may also be induced by an enzyme from the class of
enzymes known as helicases or by the enzyme RecA, which has helicase
activity, and in the presence of riboATP, is known to denature DNA. The
reaction conditions suitable for strand separation of nucleic acids with
helicases are described by Kuhn Hoffmann-Berling (CSH-Quantitative
Biology, 43:63, 1978) and techniques for using RecA are reviewed in
C. Radding (Ann. Rev. Genetics, 16:405-437, 1982).

When complementary strands of nucleic acid or acids are separated, 
regardless of whether the nucleic acid was originally double or single 
stranded, the separated strands are ready to be used as a template for the 
synthesis of additional nucleic acid strands. This synthesis is performed under
conditions allowing hybridization of primers to templates to occur. Synthesis
generally occurs in a buffered aqueous solution at a pH of about 7-9.
Preferably, a molar excess (for genomic nucleic acid, usually about 10:1
primer:template) of the two oligonucleotide primers is added to the buffer
containing the separated template strands. It is understood, however, that the
amount of complementary strand may not be known if the process of the
invention is used for diagnostic applications, so that the amount of primer
relative to the amount of complementary strand cannot be determined with
certainty. As a practical matter, however, the amount of primer added will
generally be in molar excess over the amount of complementary strand
(template) when the sequence to be amplified is contained in a mixture of
complicated long-chain nucleic acid strands. A large molar excess is preferred
to improve the efficiency of the process.

The deoxyribonucleoside triphosphates dATP, dCTP, dGTP, and dTTP
are added to the synthesis mixture, either separately or together with the
primers, in adequate amounts and the resulting solution is heated to about
90°-100°C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this
heating period, the solution is allowed to cool to approximately room
temperature, which is preferable for the primer hybridization. To the cooled
mixture is added an appropriate agent for effecting the primer extension
reaction (called herein "agent for polymerization"), and the reaction is allowed
to occur under conditions known in the art. The agent for polymerization may
also be added together with the other reagents if it is heat stable. This
synthesis (or amplification) reaction may occur at room temperature up to a
temperature above which the agent for polymerization no longer functions.
Thus, for example, if DNA polymerase is used as the agent, the temperature is
generally no greater than about 40°C. Most conveniently the reaction occurs
at room temperature.

The agent for polymerization may be any compound or system which 
will function to accomplish the synthesis of primer extension products, 

including enzymes. Suitable enzymes for this purpose include, for example,
E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase I, T4
DNA polymerase, other available DNA polymerases, polymerase muteins,
reverse transcriptase, and other enzymes, including heat-stable enzymes (i.e.,
those enzymes which perform primer extension after being subjected to
temperatures sufficiently elevated to cause denaturation). Suitable enzymes
will facilitate combination of the nucleotides in the proper manner to form the
primer extension products which are complementary to each locus nucleic acid
strand. Generally, the synthesis will be initiated at the 3' end of each primer
and proceed in the 5' direction along the template strand, until synthesis
terminates, producing molecules of different lengths. There may be agents for
polymerization, however, which initiate synthesis at the 5' end and proceed in
the other direction, using the same process as described above.

Preferably, the method of amplifying is by PCR, as described herein
and as is commonly used by those of ordinary skill in the art. However,
alternative methods of amplification have been described and can also be
employed.

Once amplified, the nucleic acid can be attached to a solid support, 
such as a membrane, and can be hybridized with any probe of interest, to 
detect any nucleic acid sequence. Several membranes are known to one of 
skill in the art for the adhesion of nucleic acid sequences. Specific
non-limiting examples of these membranes include nitrocellulose (Nitropure) or
other membranes used for detection of gene expression such as
polyvinylchloride, diazotized paper and other commercially available
membranes such as Genescreen™, Zetaprobe™ (Biorad), and Nytran™.
Methods for attaching nucleic acids to these membranes are well known to one
of skill in the art. Alternatively, screening can be done in a liquid phase.

In nucleic acid hybridization reactions, the conditions used to achieve a 
particular level of stringency will vary, depending on the nature of the nucleic 
acids being hybridized. For example, the length, degree of complementarity, 
nucleotide sequence composition (e.g., GC v. AT content), and nucleic acid
type (e.g., RNA v. DNA) of the hybridizing regions of the nucleic acids can be
considered in selecting hybridization conditions. An additional consideration 
is whether one of the nucleic acids is immobilized, for example, on a filter. 

An example of progressively higher stringency conditions is as
follows: 2 x SSC/0.1% SDS at about room temperature (hybridization
conditions); 0.2 x SSC/0.1% SDS at about room temperature (low stringency
conditions); 0.2 x SSC/0.1% SDS at about 42°C (moderate stringency
conditions); and 0.1 x SSC at about 68°C (high stringency conditions).
Washing can be carried out using only one of these conditions, e.g., high
stringency conditions, or each of the conditions can be used, e.g., for 10-15
minutes each, in the order listed above, repeating any or all of the steps listed.
However, as mentioned above, optimal conditions will vary, depending on the
particular hybridization reaction involved, and can be determined empirically. 
In general, conditions of high stringency are used for the hybridization of the 
probe of interest. 
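
The series of conditions listed above can be written down as a small data structure, for example when recording a wash protocol. The following Python sketch is one possible encoding; the class name, field names, and the wash_schedule helper are hypothetical, and room temperature is left as None because the text gives no numeric value for it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Wash:
    label: str
    ssc: float                # SSC concentration, in multiples of 1x
    sds_percent: float        # SDS concentration in percent; 0 if none
    temp_c: Optional[float]   # None stands for "about room temperature"


# Conditions in the order listed in the text, lowest stringency first.
WASHES = [
    Wash("hybridization", 2.0, 0.1, None),
    Wash("low stringency", 0.2, 0.1, None),
    Wash("moderate stringency", 0.2, 0.1, 42.0),
    Wash("high stringency", 0.1, 0.0, 68.0),
]


def wash_schedule(up_to: str, minutes: int = 15) -> List[Tuple[str, int]]:
    """Return (label, minutes) steps up to and including the named condition."""
    labels = [w.label for w in WASHES]
    if up_to not in labels:
        raise ValueError(f"unknown condition: {up_to}")
    return [(label, minutes) for label in labels[: labels.index(up_to) + 1]]


for step in wash_schedule("moderate stringency", minutes=10):
    print(step)
```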

The probe of interest can be detectably labeled, for example, with a
radioisotope, a fluorescent compound, a bioluminescent compound, a
chemiluminescent compound, a metal chelator, or an enzyme. Those of 
ordinary skill in the art will know of other suitable labels for binding to the 
probe, or will be able to ascertain such, using routine experimentation. 

In one embodiment, representational difference analysis (RDA, see 
Lisitsyn et al., Science 259:946-951, 1993, herein incorporated by reference)
can be performed on CpG-containing nucleic acid following MCA. RDA
utilizes kinetic and subtractive enrichment to purify restriction endonuclease
fragments present in one population of nucleic acid fragments but not in
another. Thus, RDA enables the identification of small differences between
the sequences of two nucleic acid populations. RDA uses nucleic acid from
one population as a "tester" and nucleic acid from a second population as a
"driver," in order to clone probes for single copy sequences present in (or
absent from) one of the two populations. In one embodiment, nucleic acid
from a "normal" individual or sample, not having a disorder such as a
cell-proliferative disorder, is used as a "driver," and nucleic acid from an
"affected" individual or sample, having the disorder such as a cell
proliferative disorder, is used as a "tester." In one embodiment, the nucleic
acid used as a "tester" is isolated from an individual having a cell
proliferative disorder such as colon cancer, lung cancer, renal cancer,
leukemia, breast cancer, prostate cancer, uterine cancer, astrocytoma,
glioblastoma, and neuroblastoma. The nucleic acid used as a "driver" is thus
from normal colon, normal lung, normal kidney, normal blood cells, normal
breast, normal prostate, normal uterus, normal astrocytes, normal glial cells
and normal neurons, respectively. In an additional embodiment, the nucleic
acid used as a "driver" is isolated from an individual having a cell
proliferative disorder such as colon cancer, lung cancer, renal cancer,
leukemia, breast cancer, prostate cancer, uterine cancer, astrocytoma,
glioblastoma, and neuroblastoma. The nucleic acid used as a "tester" is thus
from normal colon, normal lung, normal kidney, normal blood cells, normal
breast, normal prostate, normal uterus, normal astrocytes, normal glial cells
and normal neurons, respectively. One of skill in the art will readily be able
to identify the "tester" nucleic acid useful to identify methylated nucleic
acid sequences in a given "driver" population.

SCREENING AGENTS FOR AN EFFECT ON METHYLATION

The invention provides a method for identifying an agent which can 
affect methylation. An agent can affect methylation by either increasing or 
decreasing methylation. The method includes incubating an agent and a
sample containing a CpG-containing polynucleotide under conditions
sufficient to allow the components to interact, and measuring the effect of the
compound on the methylation of the CpG-containing nucleic acid. In one
embodiment, the sample is a cell expressing a polynucleotide of interest. In
another embodiment, the sample is substantially purified nucleic acid.
"Substantially purified" nucleic acid is nucleic acid which has been separated
from the cellular components which naturally accompany it, or from
contaminating elements such as proteins, lipids, or chemical resins.
Substantially pure nucleic acid can be extracted from any cell type, or can be
chemically synthesized. Purity can be measured by any appropriate method,
such as measuring the absorbance of light (e.g., A260/A280 ratio).
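
As an illustration of the absorbance measurement mentioned above, the following Python sketch computes an A260/A280 ratio and compares it with an acceptance window. The window of 1.7 to 2.0 is an assumption based on the common rule of thumb that relatively pure DNA gives a ratio near 1.8; it is not a threshold stated in this disclosure.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def looks_substantially_pure(a260: float, a280: float,
                             low: float = 1.7, high: float = 2.0) -> bool:
    """True if the ratio falls inside the assumed acceptance window."""
    return low <= a260_a280_ratio(a260, a280) <= high


print(looks_substantially_pure(0.9, 0.5))  # ratio about 1.8
print(looks_substantially_pure(0.6, 0.5))  # ratio about 1.2, suggests contamination
```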

The nucleic acid can be identified by the methylated CpG island 
amplification, as described above. The methylation of the polynucleotide in
the sample can then be compared to the methylation of a control sample not
incubated with the agent. The effect of the agent on methylation of a
polynucleotide can be measured by assessing the methylation of the
polynucleotide by the methods of the invention. Alternatively, the effect of
the agent on methylation of a polynucleotide can be measured by assessing the
expression of the polynucleotide of interest. Means of measuring expression
are well known to one of skill in the art (e.g., Northern blotting or RNA dot
blotting, amongst others).

The agents which affect methylation can include peptides,
peptidomimetics, polypeptides, pharmaceuticals, chemical compounds,
and biological agents. Psychotropic, antiviral, and chemotherapeutic
compounds can also be tested using the method of the invention.

"Incubating" includes conditions which allow contact between the test 
agent and the cell of interest. "Contacting" includes in solution and solid 
phase. The test agent may also be a combinatorial library for screening a 
plurality of compounds. Agents identified in the method of the invention can
be further cloned, sequenced, and the like, either in solution or after binding to
a solid support, by any method usually applied to the isolation of a specific
DNA sequence. Molecular techniques for DNA analysis (Landegren et al.,
Science 242:229-237, 1988) and cloning have been reviewed (Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor
Laboratory Press, Plainview, NY, 1998).

The sample can be any sample of interest. The sample may be a cell 
sample or a membrane sample prepared from a cell sample. Suitable cells 
include any host cells containing a nucleic acid including a CpG island. The 
cells can be primary cells or cells of a cell line.

In one embodiment, the agent is incubated with the sample of interest 
suspected of including a CpG-containing nucleic acid and methylation is 
evaluated by MCA. Thus, nucleic acid from the sample suspected of 
including a CpG-containing nucleic acid is contacted with a methylation
sensitive restriction endonuclease which cleaves only unmethylated CpG sites
under conditions and for a time to allow cleavage of unmethylated nucleic
acid. An isoschizomer of the methylation sensitive restriction endonuclease is
also utilized. An oligonucleotide is added to the nucleic acid sample under
conditions and for a time to allow ligation of the oligonucleotide to nucleic
acid cleaved by said restriction endonuclease, and the digested nucleic acid is
amplified. The digested nucleic acid is adhered to a membrane, and the
membrane is hybridized with a probe of interest. In one embodiment,
representational difference analysis can also be performed.
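
The sequence of digestion, ligation, and amplification steps above can be sketched as a toy model. The Python function below rests on a stated assumption that is not spelled out in this passage: the methylation sensitive enzyme is taken to leave ends that cannot accept the oligonucleotide, while the isoschizomer is taken to cut the remaining methylated sites and leave ends that can. The site map, type alias, and function name are hypothetical.

```python
from typing import List, Tuple

Site = Tuple[int, bool]  # (position on the molecule, is_methylated)


def amplifiable_fragments(sites: List[Site]) -> List[Tuple[int, int]]:
    """Return (start, end) intervals expected to be amplified.

    Assumed model: the methylation sensitive enzyme cuts every unmethylated
    site and leaves ends the adaptor cannot ligate to; the isoschizomer then
    cuts the remaining (methylated) sites and leaves adaptor-compatible ends.
    Only fragments carrying an adaptor on both ends are amplified, i.e.
    fragments bounded by two neighbouring methylated sites.
    """
    ordered = sorted(sites)
    fragments = []
    for (left_pos, left_meth), (right_pos, right_meth) in zip(ordered, ordered[1:]):
        if left_meth and right_meth:
            fragments.append((left_pos, right_pos))
    return fragments


# Hypothetical site map: positions in base pairs, True = methylated.
sites = [(100, True), (450, True), (900, False), (1300, True), (1700, True)]
print(amplifiable_fragments(sites))  # [(100, 450), (1300, 1700)]
```

Under this model, a sample with no methylated sites yields no amplifiable fragments, which is why the amplified product reflects methylation status.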

KITS 

The materials for use in the assay of the invention are ideally suited for
the preparation of a kit. Such a kit may comprise a carrier means containing
one or more container means such as vials, tubes, and the like, each of the
container means comprising one of the separate elements to be used in the
method. One of the container means can comprise a container containing an
oligonucleotide for ligation to nucleic acid cleaved by a methylation sensitive
restriction endonuclease. One or more container means can also be included
comprising a primer complementary to the oligonucleotide. In addition, one
or more container means can also be included which comprise a methylation
sensitive restriction endonuclease. One or more container means can also be
included containing an isoschizomer of said methylation sensitive restriction
enzyme.

In another embodiment, the kit may comprise a carrier means 
containing one or more container means comprising a solid support, wherein 
the solid support has a nucleic acid sequence selected from the group
consisting of MINT1-33 immobilized on the solid support. In one
embodiment, the solid support is a membrane. Several membranes are known
to one of skill in the art for the adhesion of nucleic acid sequences. Specific
non-limiting examples of these membranes include nitrocellulose (Nitropure)
or other membranes used for detection of gene expression such as
polyvinylchloride, diazotized paper and other commercially available
membranes such as Genescreen™, Zetaprobe™ (Biorad), and Nytran™. The
MINT1-33 sequences immobilized on the solid support can then be hybridized
to nucleic acid sequences produced by performing the MCA procedure on the
nucleic acids of a sample of interest in order to determine if the nucleic acid
sequences contained in the sample are methylated.

POLYNUCLEOTIDES AND POLYPEPTIDES

In another embodiment, the invention provides isolated MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polynucleotides (SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID
NO:33, respectively). These polynucleotides include DNA, cDNA and RNA
sequences which encode MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33
polypeptides. It is understood that naturally occurring, synthetic, and
intentionally manipulated polynucleotides are included. For example,
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14,
MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33 nucleic acids may be
subjected to site-directed mutagenesis. The nucleic acid sequence for MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 also includes antisense sequences,
and sequences encoding dominant negative forms of MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17,
MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30,
MINT31, MINT32 and MINT33.

The invention provides methylated and unmethylated forms of MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polynucleotides (SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID
NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24,
SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID
NO:33, respectively). Methylated nucleic acid sequences are also provided
which include MINT3, MINT5, MINT7, MINT11, MINT12, MINT13,
MINT16, MINT18, MINT21, MINT25, MINT26, MINT28, and MINT29
(SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:11, SEQ ID
NO:12, SEQ ID NO:13, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:21,
SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:28, and SEQ ID NO:29,
respectively). It is understood that naturally occurring, synthetic, and
intentionally manipulated polynucleotides are included.



The polynucleotides of the invention include "degenerate variants,"
sequences that are degenerate as a result of the genetic code. There are 20
natural amino acids, most of which are specified by more than one codon.
Therefore, all degenerate nucleotide sequences are included in the invention as
long as the amino acid sequence of a polypeptide encoded by the nucleotide 
sequence of SEQ ID NOs: 1-33 is functionally unchanged. 
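
Codon degeneracy can be demonstrated with a short translation routine. The following Python sketch builds the standard genetic code table and tests whether two different DNA sequences encode the same amino acid sequence. The function names and the nine-base example sequences are hypothetical, and the check requires an identical amino acid sequence, which is stricter than the "functionally unchanged" standard used above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (length divisible by 3) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate_variants(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ at the nucleotide level but encode the same peptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


print(translate("ATGGCTCTG"))                             # MAL
print(are_degenerate_variants("ATGGCTCTG", "ATGGCCTTA"))  # True
```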

Specifically disclosed herein are methylated and unmethylated isolated
polynucleotide sequences of MINT1, MINT2, MINT4, MINT6, MINT8,
MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and
MINT33. Preferably, the nucleotide sequence is SEQ ID NO:1, SEQ ID
NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID
NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33,
respectively. Specifically disclosed herein are methylated isolated
polynucleotide sequences of MINT3, MINT5, MINT7, MINT11, MINT12,
MINT13, MINT16, MINT18, MINT21, MINT25, MINT26, MINT28, and
MINT29. Preferably, the nucleotide sequence is SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ
ID NO:16, SEQ ID NO:18, SEQ ID NO:21, SEQ ID NO:25, SEQ ID NO:26,
SEQ ID NO:28, and SEQ ID NO:29, respectively. The term "polynucleotide"
or "nucleic acid sequence" refers to a polymeric form of nucleotides at least
10 bases in length. By "isolated polynucleotide" is meant a polynucleotide
that is not immediately contiguous with both of the coding sequences with
which it is immediately contiguous (one on the 5' end and one on the 3' end) in
the naturally occurring genome of the organism from which it is derived. The
term therefore includes, for example, a recombinant DNA which is
incorporated into a vector; into an autonomously replicating plasmid or virus;
or into the genomic DNA of a prokaryote or eukaryote, or which exists as a
separate molecule (e.g., a cDNA) independent of other sequences. The
nucleotides of the invention can be ribonucleotides, deoxyribonucleotides, or
modified forms of either nucleotide. The term includes single and double
stranded forms of DNA.

The polynucleotide encoding MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and
MINT33 includes SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:14, SEQ
ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:22,
SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID
NO:31, SEQ ID NO:32, and SEQ ID NO:33, dominant negative forms of
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14,
MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33, and nucleic acid
sequences complementary to SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4,
SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID
NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20,
SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID
NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33. A
complementary sequence may include an antisense nucleotide. When the
sequence is RNA, the deoxynucleotides A, G, C, and T of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ
ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33
are replaced by ribonucleotides A, G, C, and U, respectively. Also included in
the invention are fragments of the above-described nucleic acid sequences that
are at least 15 bases in length, which is sufficient to permit the
fragment to selectively hybridize to DNA that is encoded by SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ
ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,
SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, and SEQ ID NO:33
under physiological conditions or a close family member of MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17,
MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30,
MINT31, MINT32 and MINT33. The term "selectively hybridize" refers to
hybridization under moderately or highly stringent conditions which excludes
non-related nucleotide sequences. Hybridization conditions have been
described above.
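
Two of the points above lend themselves to a brief sketch: the replacement of T by U when a sequence is written as RNA, and the minimum fragment length of 15 bases. The following Python functions are illustrative only; their names and the example sequence are hypothetical and are not drawn from the sequence listing.

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA sequence as RNA: A, G, and C are kept and T becomes U."""
    dna = dna.upper()
    unexpected = set(dna) - set("AGCT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U")


def is_long_enough_fragment(fragment: str, minimum: int = 15) -> bool:
    """True if the fragment meets the minimum length given in the text."""
    return len(fragment) >= minimum


print(dna_to_rna("GATTACA"))                       # GAUUACA
print(is_long_enough_fragment("GATTACAGATTACAG"))  # True (15 bases)
```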

The MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32, and MINT33 nucleotide
sequence includes the disclosed sequence and conservative variations of the
polypeptides encoded by MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33
polynucleotides. The term "conservative variation" as used herein denotes the
replacement of an amino acid residue by another, biologically similar residue.
Examples of conservative variations include the substitution of one
hydrophobic residue such as isoleucine, valine, leucine or methionine for
another, or the substitution of one polar residue for another, such as the
substitution of arginine for lysine, glutamic acid for aspartic acid, or
glutamine for asparagine, and the like. The term "conservative variation" also
includes the use of a substituted amino acid in place of an unsubstituted parent
amino acid provided that antibodies raised to the substituted polypeptide also
immunoreact with the unsubstituted polypeptide.
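
The examples of conservative variation given above can be encoded as groups of interchangeable residues. The following Python sketch covers only the residues named in the text; the group list and function name are hypothetical, and the phrase "and the like" means a fuller classification would include further groups.

```python
# One-letter amino acid codes for the residues named in the text.
CONSERVATIVE_GROUPS = [
    {"I", "V", "L", "M"},  # hydrophobic: isoleucine, valine, leucine, methionine
    {"R", "K"},            # arginine and lysine
    {"E", "D"},            # glutamic acid and aspartic acid
    {"Q", "N"},            # glutamine and asparagine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing one residue by the other stays within a listed group.

    Identical residues return False because no substitution has taken place.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("K", "R"))  # True: lysine replaced by arginine
print(is_conservative("L", "D"))  # False: hydrophobic replaced by acidic
```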

MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33 nucleic acid
sequences can be expressed in vitro by DNA transfer into a suitable host cell.
"Host cells" are cells in which a vector can be propagated and its DNA
expressed. The cell may be prokaryotic or eukaryotic. The term also includes
any progeny of the subject host cell. It is understood that all progeny may not
be identical to the parental cell since there may be mutations that occur during
replication. However, such progeny are included when the term "host cell" is
used. Methods of stable transfer, meaning that the foreign DNA is
continuously maintained in the host, are known in the art.

In one aspect, the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and MINT33
polynucleotide sequences may be inserted into an expression vector. The term
"expression vector" refers to a plasmid, virus or other vehicle known in the art
that has been manipulated by insertion or incorporation of the genetic
sequences of interest. Polynucleotide sequences which encode a sequence
of interest can be operatively linked to expression control sequences.
"Operatively linked" refers to a juxtaposition wherein the components so
described are in a relationship permitting them to function in their intended
manner. An expression control sequence operatively linked to a coding
sequence is ligated such that expression of the coding sequence is achieved
under conditions compatible with the expression control sequences. As used
herein, the term "expression control sequences" refers to nucleic acid
sequences that regulate the expression of a nucleic acid sequence to which they
are operatively linked. Expression control sequences are operatively linked to a
nucleic acid sequence when the expression control sequences control and
regulate the transcription and, as appropriate, translation of the nucleic acid
sequence. Thus expression control sequences can include appropriate
promoters, enhancers, transcription terminators, a start codon (i.e., ATG) in
front of a protein-encoding gene, splicing signals for introns, maintenance of
the correct reading frame of that gene to permit proper translation of mRNA,
and stop codons. The term "control sequences" is intended to include, at a
minimum, components whose presence can influence expression, and can also
include additional components whose presence is advantageous, for example,
leader sequences and fusion partner sequences. Expression control sequences
can include a promoter.
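
The roles of the start codon, the reading frame, and the stop codons mentioned above can be illustrated with a simple scan. The following Python sketch looks for the first ATG that is followed, in the same reading frame, by a stop codon. The function name and example sequence are hypothetical, and the scan ignores introns, promoters, and the other control elements listed above.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_open_reading_frame(dna: str) -> Optional[str]:
    """Return the first ATG...stop stretch read in frame, or None if there is none."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Step through codons in the frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find("ATG", start + 1)
    return None


print(first_open_reading_frame("CCATGGCTTGATAA"))  # ATGGCTTGA
print(first_open_reading_frame("ATGGCT"))          # None: no in-frame stop codon
```

A single inserted or deleted base between the ATG and the stop codon would shift the frame, which is why the text lists maintenance of the correct reading frame among the requirements.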

By "promoter" is meant minimal sequence sufficient to direct 
transcription. Also included in the invention are those promoter elements
which are sufficient to render promoter-dependent gene expression
controllable for cell-type specific or tissue-specific expression, or inducible
by external signals or agents; such elements may be located in the 5' or 3'
regions of the gene. Both constitutive and inducible promoters are included in
the invention (see, e.g., Bitter et al., Methods in Enzymology 153:516-544, 1987).
For example, when cloning in bacterial systems, inducible promoters such as
pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like
may be used. When cloning in mammalian cell systems, promoters derived
from the genome of mammalian cells (e.g., metallothionein promoter) or from
mammalian viruses (e.g., the retrovirus long terminal repeat; the adenovirus
late promoter; the vaccinia virus 7.5K promoter) may be used. Promoters
produced by recombinant DNA or synthetic techniques may also be used to
provide for transcription of the nucleic acid sequences of the invention.

In the present invention, the MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and
MINT33 polynucleotide sequence may be inserted into an expression vector
which contains a promoter sequence which facilitates the efficient
transcription of the inserted genetic sequence in the host. The expression
vector typically contains an origin of replication, a promoter, as well as
specific genes which allow phenotypic selection of the transformed cells.
Vectors suitable for use in the present invention include, but are not limited to,
the T7-based expression vector for expression in bacteria (Rosenberg et al.,
Gene 56:125, 1987), the pMSXND expression vector for expression in
mammalian cells (Lee and Nathans, J. Biol. Chem. 262:3521, 1988) and
baculovirus-derived vectors for expression in insect cells. The DNA segment
can be present in the vector operably linked to regulatory elements, for
example, a promoter (e.g., T7, metallothionein I, or polyhedrin promoters).

MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17,
MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 and
MINT33 polynucleotide sequences can be expressed in either prokaryotes or
eukaryotes. Hosts can include microbial, yeast, insect and mammalian
organisms. Methods of expressing DNA sequences having eukaryotic or viral
sequences in prokaryotes are well known in the art. Biologically functional
viral and plasmid DNA vectors capable of expression and replication in a
host are known in the art. Such vectors are used to incorporate DNA
sequences of the invention.

By "transformation" is meant a genetic change induced in a cell following
incorporation of new DNA (i.e., DNA exogenous to the cell). Where the cell
is a mammalian cell, the genetic change is generally achieved by
introduction of the DNA into the genome of the cell (i.e., stable).

By "transformed cell" is meant a cell into which (or into an ancestor of
which) has been introduced, by means of recombinant DNA techniques, a DNA
molecule encoding a sequence of interest. Transformation of a host cell
with recombinant DNA may be carried out by conventional techniques as are
well known to those skilled in the art. Where the host is prokaryotic, such
as E. coli, competent cells which are capable of DNA uptake can be prepared
from cells harvested after exponential growth phase and subsequently
treated by the CaCl2 method using procedures well known in the art.
Alternatively, MgCl2 or RbCl can be used. Transformation can also be
performed after forming a protoplast of the host cell if desired.

When the host is a eukaryote, such methods of transfection of DNA as
calcium phosphate co-precipitates, conventional mechanical procedures such
as microinjection, electroporation, insertion of a plasmid encased in
liposomes, or virus vectors may be used. Eukaryotic cells can also be
cotransformed with DNA sequences encoding the sequence of interest, and a
second foreign DNA molecule encoding a selectable phenotype, such as the
herpes simplex thymidine kinase gene. Another method is to use a eukaryotic
viral vector, such as simian virus 40 (SV40) or bovine papilloma virus, to
transiently infect or transform eukaryotic cells and express the protein
(see, for example, Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory,
Gluzman ed., 1982).

Isolation and purification of microbially expressed polypeptide, or
fragments thereof, provided by the invention, may be carried out by
conventional means including preparative chromatography and immunological
separations involving monoclonal or polyclonal antibodies.

In one embodiment, the invention provides substantially purified
polypeptide encoded by MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polynucleotide sequences. The term
"substantially purified" as used herein refers to a polypeptide which is
substantially free of other proteins, lipids, carbohydrates or other
materials with which it is naturally associated. One skilled in the art can
purify a polypeptide encoded by MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32 and MINT33 polynucleotide sequence using
standard techniques for protein purification. The substantially pure
polypeptide will yield a single major band on a non-reducing polyacrylamide
gel. The purity of the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 and MINT33 polypeptide can also be determined by
amino-terminal amino acid sequence analysis.

Minor modifications of the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9,
MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32, and MINT33 primary amino acid sequences may
result in proteins which have substantially equivalent activity as compared
to the unmodified counterpart polypeptide described herein. Such
modifications may be deliberate, as by site-directed mutagenesis, or may be
spontaneous. All of the polypeptides produced by these modifications are
included herein as long as the biological activity still exists.

The polypeptides of the invention also include dominant negative forms of
the MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15,
MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32 or MINT33 polypeptide which do not have the biological activity of
MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17,
MINT19, MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 or
MINT33 polynucleotide sequence. A "dominant negative form" of MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32, or MINT33 is a
polypeptide that is structurally similar to MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22,
MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33 polypeptide but
does not have wild-type MINT1, MINT2, MINT4, MINT6, MINT8, MINT9, MINT10,
MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24, MINT27,
MINT30, MINT31, MINT32 or MINT33 function. For example, a dominant-negative
MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33
polypeptide may interfere with wild-type MINT1, MINT2, MINT4, MINT6, MINT8,
MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20, MINT22, MINT23,
MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33 function by binding to, or
otherwise sequestering, regulating agents, such as upstream or downstream
components, that normally interact functionally with the MINT1, MINT2,
MINT4, MINT6, MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19, MINT20,
MINT22, MINT23, MINT24, MINT27, MINT30, MINT31, MINT32 or MINT33
polypeptide.

EXAMPLES 

The following examples are intended to illustrate but not to limit the
invention in any manner, shape, or form, either explicitly or implicitly.
While they are typical of those that might be used, other procedures,
methodologies, or techniques known to those skilled in the art may
alternatively be used.

EXAMPLE 1

DETECTION OF METHYLATED CpG ISLANDS USING MCA

The principle underlying MCA involves amplification of closely spaced
methylated SmaI sites to enrich for methylated CGIs. The MCA technique is
outlined in Figure 1A. About 70 to 80% of CpG islands contain at least two
closely spaced (<1 kb) SmaI sites (CCCGGG). Only those SmaI sites within
these short distances can be amplified using MCA, ensuring representation
of the most CpG rich sequences. Briefly, DNA is digested with SmaI, which
cleaves only unmethylated sites, leaving blunt ends between the C and G.
DNA is then digested with the SmaI isoschizomer XmaI, which does cleave
methylated CCCGGG sites, and which leaves a 4 base overhang. Adaptors are
ligated to this overhang, and PCR is performed using primers complementary
to these adaptors. The amplified DNA is then spotted on a nylon membrane
and can be hybridized with any probe of interest.
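
By way of illustration only, the site-spacing requirement described above
can be sketched in Python. The function names, the toy sequence, and the
treatment of the 1 kb limit as a strict cutoff between adjacent sites are
assumptions made for this example and are not part of the described method.
The sketch only locates CCCGGG sites in a sequence; it does not model
methylation status, digestion, ligation or PCR.

```python
import re

SMAI_SITE = "CCCGGG"  # recognition sequence shared by SmaI and XmaI


def find_sites(seq, site=SMAI_SITE):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    # A lookahead match reports overlapping occurrences as well; CCCGGG
    # cannot overlap itself, but this keeps the helper safe for other motifs.
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq)]


def closely_spaced_pairs(seq, max_distance=1000):
    """Return (start, end) positions of adjacent sites < max_distance bp apart."""
    sites = find_sites(seq)
    return [(a, b) for a, b in zip(sites, sites[1:]) if b - a < max_distance]


# Hypothetical toy sequence: two sites 406 bp apart, then a third 1406 bp on.
toy = "AT" * 10 + SMAI_SITE + "GC" * 200 + SMAI_SITE + "AT" * 700 + SMAI_SITE

print(find_sites(toy))            # positions of all three sites
print(closely_spaced_pairs(toy))  # only the first pair is within 1 kb
```

In this sketch only the first pair of sites would be a candidate for
amplification, and then only if both sites were methylated in the sample.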

As a model experiment, amplification of the p16 gene CGI was examined
because (1) hypermethylation of this CGI in cancer is well characterized,
and correlates with silencing of the gene (Herman et al., 1995), and (2)
this CGI contains two closely spaced SmaI sites (400 bp) which can be
amplified by MCA. Initially, the reaction was optimized by testing
different primers with a variable GC content, and different PCR conditions.
As shown in Figure 1B, using primers with a 70% GC content, the p16 CGI is
amplified strongly in the Caco2 cell line, where it is known to be
hypermethylated, while no signal above background was detected from any
normal colon mucosa. To examine the quantitative aspect of MCA, DNA from
Caco2 and normal colon mucosa was mixed in various proportions, and the
methylation level of each mix was determined using MCA. MCA detected p16
methylation in a semi-quantitative manner between 1% and 100% methylated
alleles. Finally, MCA was performed on 109 samples of normal colonic mucosa
and adjacent primary colorectal tumor that had previously been typed for
p16 methylation by Southern blot analysis (Ahuja et al., 1997). MCA and
Southern blot were concordant in 107/109 (98%) of the cases. In one case,
MCA detected a low level of methylation (5-10%) in a cancer sample that had
been judged negative by Southern blot. In the other discordant case
(positive by MCA, negative by Southern blot), the discordance may be
related to heterogeneous p16 methylation, as has been described (Costello
et al., 1996).

MCA is a novel PCR-based technique that allows for the rapid enrichment of
hypermethylated CG rich sequences, with a high representation of methylated
CpG islands. This technique can have several potential applications. MCA is
very useful for the determination of the methylation status of a large
number of samples at multiple loci simultaneously. By optimizing the PCR
conditions, it should be readily adaptable to the study of the methylation
status of any gene that has two closely spaced SmaI sites. As shown herein,
there is a very high concordance rate between MCA and other methods for the
detection of hypermethylation such as Southern blot analysis and
bisulfite-based methods. However, MCA (1) requires good quality DNA,
excluding the study of paraffin-embedded samples, (2) examines only a
limited number of CpG sites within a CGI and (3) is sensitive to incomplete
digestion using the methylation-sensitive enzyme SmaI. Nevertheless, many
steps in MCA are amenable to automation and, by allowing for the
examination of multiple genes relatively quickly, MCA may have important
applications in population-based studies of CGI methylation.



EXAMPLE 2

IDENTIFICATION OF DIFFERENTIALLY METHYLATED CGIs
IN CRC BY MCA/RDA

To identify novel CGIs aberrantly methylated in CRC, RDA (Lisitsyn et al.,
1993) was performed on MCA amplicons from the colon cancer cell line Caco2
as a tester, and a mixture of DNA from the normal colon mucosa of 5
different men (to avoid cloning polymorphic SmaI sites or inactive and
methylated X chromosome genes from women) as a driver. Two separate
experiments were conducted, one using a lower annealing temperature (72°C),
and the other using a higher annealing temperature (77°C) and more GC rich
primers. After two rounds of RDA, the PCR products were cloned, and
colonies containing inserts were identified by PCR. Based on initial
experiments, we expected most of the recovered clones to contain Alu
repetitive sequences, which are CG rich and hypermethylated (Kochanek et
al., 1993). All clones were therefore probed with an Alu fragment, and only
non-hybridizing clones were analyzed further. Out of 160 non-Alu clones, 46
were independent clones and 33 of these (MINT1-33, Methylated IN Tumors,
SEQ ID NOs:1-33, respectively) appeared to be differentially methylated in
Caco2 cells by comparing hybridization to MCA products from Caco2 and
normal colon (Figure 1C). Nineteen of the clones (MINT1-19) were obtained
using the lower annealing temperature, and 14 (MINT20-33) using the higher
temperature.

To confirm the aberrant methylation of these clones, Southern blot analysis
was performed using DNA digested with SmaI or XmaI. All of the 33 clones
were hypermethylated in Caco2 compared to normal colon mucosa. Of these 33,
one clone (MINT13) detected highly repeated sequences and two clones
(MINT18 and MINT28) appeared to correspond to mildly repeated gene families
(data not shown). All others appeared to detect single copy DNA fragments.
In addition, hypermethylation at CpG sites within the clones and distinct
from the SmaI sites was confirmed by bisulfite-PCR for 6 clones. In each
case, Caco2 was found to be hypermethylated at these sites.

By DNA sequencing (example shown in Figure 2), we found that 29 clones had
a GC content greater than 50%, and satisfied the minimal criteria for CGIs
(200 bp, GC content > 50%, CpG/GpC > 0.5) (Gardiner-Garden and Frommer,
1987). As might be expected, clones obtained with the higher annealing
temperature and more GC rich primers had a relatively higher GC content
(Table 1). The size of each clone, percentage of GC nucleotides,
observed/expected CpGs, sequence homology, and chromosomal location are
summarized in Table 1. MINT5, MINT8, MINT11, MINT14 and MINT16 contained GC
rich regions only in one end of the clones, and these may have been
recovered from the edge of CGIs.

Table 1: Summary of the 33 Differentially Methylated Clones Isolated by MCA-RDA

Clone    Size   %GC  O/E  CGI   Blast Homology         Chromosome  Methylation
         (bp)                                          Map         Pattern
MINT1    528    56   0.6  Yes   None                   5q13-14     Type C
MINT2    562    50   0.8  Yes   None                   2p22-21     Type C
MINT3    563    55   1    Yes   Human EST AA557808     1p34-35     Type A
MINT4    481    60   0.8  Yes   None                   15q25-26    Type A
MINT5    852    46   0.5  Yes*  Human CpG clone 88c1   14q21-22    Type A
MINT6    401    59   0.6  Yes   None                   12q14-15    Type A
MINT7    481    49   0.9  Yes   Human genomic DNA      6p21-22     Type A
MINT8    617    46   0.5  Yes*  None                   N.D         Type A
MINT9    605    54   0.3  No    None                   1p34-35     Type A
MINT10   608    49   0.6  Yes   None                   9q34-Ter    Type A
MINT11   637    49   0.6  Yes*  Versican               5q12-13     Type A
MINT12   552    49   0.6  Yes   CpG clone 33h2         7q31-32     Type C
MINT13   308    60   0.9  Yes   LINE1                  N.D         Cell line
MINT14   620    54   0.4  Yes*  None                   10p13-15    Type A
MINT15   641    53   0.7  Yes   None                   11p12-13    Type A
MINT16   664    62   0.5  Yes*  Alpha-tubulin          2q          Type A
MINT17   491    54   0.7  Yes   None                   6           Type C
MINT18   435    58   0.1  No    Acrogranin             N.D         Cell line
MINT19   443    55   0.2  No    None                   N.D         Type A
MINT20   510    67   0.8  Yes   mouse OTP              N.D         Type A
MINT21   411    62   0.4  No    None                   22q13       Type A
MINT22   438    60   0.9  Yes   None                   10p12       Type A
MINT23   346    64   0.8  Yes   Csx                    5q34-35     Type A
MINT24   525    63   0.7  Yes   None                   3p25-26     Type A
MINT25   339    60   0.7  Yes   Human genomic DNA      22q11       Type C
MINT26   591    58   0.8  Yes   CpG clone 73e1         7q11        Type A
MINT27   242    74   0.7  Yes   None                   N.D         Type C
MINT28   463    58   1    Yes   Ribosomal RNA gene     N.D         Type A
MINT29   429    60   0.7  Yes   CpG clone 20b1         7q11        N.D
MINT30   536    65   0.5  Yes   None                   20q11       Type A
MINT31   673    65   0.8  Yes   None                   17q21       Type C
MINT32   464    66   1    Yes   None                   20q13       Type A
MINT33   139    65   0.8  Yes   None                   N.D         N.D

O/E: Observed/expected numbers of CpGs. N.D: not determined.
* Only one portion of the clone has a CpG island.
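
For illustration, the %GC and O/E columns of Table 1 and the minimal CGI
criteria cited above can be computed along the following lines. This is a
sketch only: it assumes that the ratio intended is the observed/expected
CpG ratio in the sense of Gardiner-Garden and Frommer, i.e. (number of CpG
x sequence length) / (number of C x number of G), and it applies the
thresholds stated in the text (200 bp, GC content > 50%, ratio > 0.5) to
the whole sequence rather than to a sliding window. The function names are
hypothetical.

```python
def cpg_stats(seq):
    """Return (length, percent GC, observed/expected CpG) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides cannot overlap one another
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return n, gc_percent, obs_exp


def meets_cgi_criteria(seq, min_len=200, min_gc=50.0, min_oe=0.5):
    """Apply the minimal criteria quoted in the text to the whole sequence."""
    n, gc_percent, obs_exp = cpg_stats(seq)
    return n >= min_len and gc_percent > min_gc and obs_exp > min_oe


# Hypothetical toy inputs, not clone sequences from the invention.
print(cpg_stats("CG" * 100))
print(meets_cgi_criteria("CG" * 100), meets_cgi_criteria("AT" * 100))
```

Clones marked "Yes*" in Table 1 would require a windowed version of this
calculation, since only part of the clone satisfies the criteria.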

By DNA homology search using the BLAST program (BLAST 2.0, default
parameters, see
http://www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?Jform=0), four
clones were identical to human gene sequences, four clones were identical
to CGIs randomly sequenced from a CGI library (Cross et al., 1994), one was
identical to an EST, two clones were identical to high throughput genomic
sequences deposited in GenBank, three clones had significant homology to
other genes, and the other 19 had no significant match in the database.
MINT11 was identical to exon 1 and intron 1 of the human versican gene
(Zimmerman et al., 1989), and corresponded to the 3' edge of a promoter
associated CGI; MINT14 was identical to exon 1 of the human alpha-tubulin
gene (Dobner, P.R., et al., 1987), and was also the 3' edge of the CGI;
MINT24 corresponded to the 3' noncoding region of the human homeobox gene
CSX (Turbay et al., 1996); MINT21 had a region with 94% homology at the
nucleotide level to exon 2 of the mouse OTP gene (Simeone et al., 1994) and
probably represents the human homologue of this gene; MINT28 was homologous
to ribosomal gene sequences; MINT18 was homologous to the acrogranin gene
family. To examine the presence of potential promoter sequences in these
clones, promoter prediction was performed using several computer programs
(see programs available at
http://dot.imgen.bcm.tmc.edu:9331/seq-search/gene-search.html). Twenty out
of the 33 clones were predicted as promoters using the NNPP program, and 6
were predicted as promoters by using the TSSG program.



The chromosomal position of most of the unknown clones was determined using
a somatic cell hybrid panel and a radiation hybrid panel (Table 1). Of
note, MINT3 and MINT9 mapped to chromosome 1p35-36, MINT13 mapped to 7q31,
MINT24 mapped to 3p25-26, MINT25 mapped to 22q11-Ter, and MINT31 mapped to
17q21. All of these chromosomal segments are areas that are frequently
deleted in various tumors.

An important application of MCA is in the discovery of novel genes
hypermethylated in cancer. As demonstrated here, MCA coupled with RDA is a
rapid and powerful technology for this purpose, and compares favorably with
other described techniques (Hayashizaki et al., 1994; Gonzalgo et al.,
1997; Huang et al., 1997). In addition to the identification of genes
hypermethylated in cancer, MCA could potentially be used to discover novel
imprinted genes using parthenogenetic DNA (Kaneko-Ishino et al., 1995), as
well as novel X-chromosome genes.

EXAMPLE 3 

SILENCING OF THE VERSICAN GENE IN CRC

To determine whether some of these clones truly represented genes silenced
by methylation, we examined the versican gene in more detail. Versican is a
secreted glycoprotein that appears to be regulated by the Rb tumor
suppressor gene (Rohde et al., 1996). MINT11 corresponds to part of exon 1
and part of intron 1 of the versican gene (Figure 3A). Hypermethylation of
the two SmaI sites in exon 1 and intron 1 in colon cancer cell lines was
confirmed by both Southern blot analysis and MCA. In order to determine if
this methylation was representative of the entire CGI, including the
proximal promoter, PCR was performed on bisulfite-treated DNA using primers
designed to amplify the region around the transcription start site of this
gene. The PCR product was then digested with restriction enzymes that
distinguish methylated from unmethylated DNA. The versican promoter was
found to be completely methylated in the colon cancer cell lines DLD1,
LOVO, SW48 and SW837, and partially methylated in HCT116 and HT29 (Figure
3B). In primary colon tumors, versican was hypermethylated in 17 out of 25
cases (68%). Interestingly, some methylation of the versican promoter was
also found in normal tissues, albeit at lower levels when compared to
tumors. The level of methylation in normal colon mucosa increased with age
of the patient (Figure 3C), from an average of 6.9% in patients between 20
and 30 years of age, to an average of 28.9% in patients over 80. A linear
regression analysis revealed a significant association between age and
versican promoter methylation (R=0.7, P<0.000001). Using RT-PCR, we next
examined the expression of versican in normal colon mucosa and CRC cell
lines. Versican was found to be expressed in normal colon epithelium, but
was markedly down-regulated or absent in methylated colon cancer cell
lines. Expression of versican in all these cell lines was easily restored
after treatment with the demethylating agent 5-aza-deoxycytidine. These
data suggest that versican becomes methylated in normal colon in an
age-dependent manner, and that this leads to hypermethylation and loss of
expression in most colorectal tumors.
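
The kind of linear regression referred to above can be sketched as follows.
The age and methylation values in this example are hypothetical and are
included only to make the sketch runnable; they are not the measurements
underlying Figure 3C, and the function names are likewise assumptions. The
sketch computes an ordinary least-squares line and the Pearson correlation
coefficient; it does not compute a P value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: patient age (years) and percent promoter methylation.
ages = [25, 35, 45, 55, 65, 75, 85]
methylation = [7.0, 9.0, 15.0, 14.0, 22.0, 24.0, 29.0]

slope, intercept = fit_line(ages, methylation)
print(f"slope={slope:.3f} %/year, intercept={intercept:.2f} %")
print(f"R={pearson_r(ages, methylation):.3f}")
```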

Using MCA/RDA, 33 differentially methylated clones were identified and
characterized in detail. By sequencing, we found that 29 out of the 33
clones satisfy the criteria of CpG islands, demonstrating that MCA can
represent CGIs specifically. Of these 29 clones, 5 were already known genes
(versican, alpha-tubulin, CSX, OTP homologue and ribosomal RNA gene). Of
these, versican is most interesting in that this proteoglycan is an Rb
inducible gene (Rohde et al., 1996), suggesting that down regulation of
this gene product may have an important role in colorectal carcinogenesis,
where Rb mutations are rare. The data clearly show that aberrant
methylation of the versican gene promoter is correlated with silencing of
this gene. In addition,
methylation of the alpha-tubulin gene in Caco2 is consistent with the
results of studying the gene expression profile of colorectal cancers using
SAGE (Zang et al., 1997), which demonstrated that alpha-tubulin is markedly
down-regulated in CRC. Methylation of the CSX and OTP genes does not
coincide with their 5' end, and is therefore not expected to silence these
genes. It is possible, however, that these CpG islands are associated with
alternate transcripts of the genes, or with other nearby genes, which would
then be silenced by methylation (Wutz, A., et al., 1997). Finally,
methylation of ribosomal genes has previously been seen in aging tissues
(Swisshelm, K., et al., 1990) and therefore is not surprising to find in
cancers. Because some of the clones recovered are in the exon 1 region of
expressed genes, identification of new tumor suppressor genes (TSGs) might
be facilitated by using MCA/RDA clones as probes for screening cDNA
libraries. Indeed, based on their chromosome location, several clones map
to chromosomal regions thought to harbor TSGs because they are frequently
deleted in various tumors (e.g., chromosome 1p35, 3p25-26, 7q31, 17q21 and
22q11-Ter).

EXAMPLE 4 

TWO TYPES OF METHYLATION IN CRC

By examining the methylation status of several known genes in colorectal
tumors, it has been previously demonstrated that some genes tend to be
methylated in an age-dependent manner in normal colon (Issa et al., 1994),
and are frequently methylated in CRC, while others are methylated in
cancers exclusively (Ahuja et al., 1997). To examine this issue on a genome
wide level in some detail, the methylation profile of 31 MINT clones in a
panel of colorectal tumors and corresponding normal colon mucosa was
examined using MCA (two clones could not be accurately studied because of
high background (MINT29) or small size (MINT33)). Because all of these
clones were recovered from a CRC cell line, there was an initial concern
that many of these were not representative of methylation in primary
(uncultured) tumors. However, of the 31 clones, 29 were also found to be
methylated in some primary CRC. The two clones methylated only in the cell
line Caco2 were (1) MINT14, a LINE element, and (2) MINT18, a sequence that
had a very low CpG frequency and did not qualify as a CGI. Thus, all
non-repetitive CGIs recovered were methylated in primary CRC as well as
cell lines. Hypermethylation patterns of these 29 clones fell into two
distinct categories. A majority of the clones (22 out of 29) were found to
be frequently methylated (>70%) in the tumors tested, and a slight amount
of methylation was also detected in normal colon mucosa. For all of these
clones, the normal colon mucosa obtained from young patients showed less
methylation compared to the normal mucosa from older patients (Figure 4B).
Thus, the majority of CGIs hypermethylated in CRC are methylated in normal
colon mucosa as well, in an age related manner. This methylation was named
Type A for aging-specific methylation.

The remaining 7 clones were methylated exclusively in CRC, and their
frequency of methylation was significantly lower than type A methylation
(ranging from 10% to 50%). This type of methylation was named type C for
cancer-specific.
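
The distinction drawn above between Type A, Type C and cell line restricted
methylation can be restated as a simple decision rule. The following Python
sketch is an illustrative simplification only: the function name and its
two inputs are hypothetical, and the rule reduces the described
observations to whether a clone is methylated in any primary tumor and
whether methylation is also detected in normal colon mucosa.

```python
def classify_clone(tumor_frequency, methylated_in_normal_mucosa):
    """Assign a methylation pattern label to a clone.

    tumor_frequency: fraction (0.0 to 1.0) of primary tumors in which the
        clone is methylated.
    methylated_in_normal_mucosa: True if methylation is also detected in
        normal colon mucosa.
    """
    if tumor_frequency <= 0.0:
        # Methylated in the cell line but in no primary tumor examined.
        return "Cell line"
    if methylated_in_normal_mucosa:
        # Present in normal mucosa (age related) and in tumors.
        return "Type A"
    # Detected only in cancers.
    return "Type C"


# Hypothetical inputs, not measurements from the examples.
print(classify_clone(0.80, True))
print(classify_clone(0.30, False))
print(classify_clone(0.00, False))
```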

Recently, several reports have suggested that aberrant methylation of CGIs
may play an important role in cancer development (Baylin et al., 1998;
Jones, 1997). However, there is little integrated information on aberrant
CGI methylation in cancer at multiple loci, probably because of the lack of
a method to detect methylation in a large number of samples for unselected
CGIs throughout the genome. Furthermore, it has been shown that cultured
cell lines have a high degree of CGI methylation (Antequera et al., 1990)
but it was not known to what extent this reflects methylation in primary
cancers. To address these issues, the relatively quantitative and high
output features of MCA allowed us to determine the methylation profile of
31 differentially methylated loci in a panel of colorectal carcinomas.

Despite the fact that all sequences were initially recovered from a colon
cancer cell line, only 2 out of the 31 clones showed cell line restricted
methylation. From the sequence data, one of these two clones was a repeated
sequence (LINE1), and the other was not a CGI. Thus most of the single copy
clones recovered proved to be methylated not only in cell lines but also in
some primary colon cancers. Analysis of these 29 clones revealed two
distinct types of hypermethylation in cancer (Type A for aging and Type C
for cancer), which may have distinct causes, and different roles in cancer
development. Type A methylation was seen in the majority of these clones:
22 of 29 (74%) clones were methylated in an age-related manner in normal
colon tissue, and hypermethylated at a high frequency in CRC, as we have
shown for the ER gene (Issa et al., 1994) and others (Issa et al., 1996;
Ahuja et al., submitted). These results suggest that a large number of CGIs
in the human genome are incrementally methylated during the aging process
and, for many genes, this methylation correlates with reduced gene
expression as shown for ER (Issa et al., 1994) and versican. Although the
mechanism of Type A methylation is unknown, it is likely to result from
physiological processes rather than a genetic alteration because (1) it is
very frequent and affects large numbers of cells, (2) it is present in all
individuals, not just patients with cancer and (3) this process is gene and
tissue specific (Ahuja et al., submitted). Because the methylation status
at a given CGI is thought to be related to positive (methylator) factors
(Mummaneni et al., 1993; Mummaneni et al., 1995; Magewu and Jones, 1994;
Vertino et al., 1996) and negative (protector) factors (Macleod et al.,
1994; Brandeis et al., 1994; Turker and Bestor, 1997; Chen et al., 1997),
it is possible that for some genes, this balance slightly favors de-novo
methylation, and that this is reflected by progressive hypermethylation
after repeated cell divisions.

EXAMPLE 5 

GLOBAL HYPERMETHYLATION IN CRC

To understand the patterns of cancer-specific methylation in CRC, the
methylation status of all 7 type C clones was analyzed, as well as p16, in
primary cancers and polyps (Figure 4). Two of these clones (MINT1 and
MINT2) were studied by both MCA and bisulfite-PCR, and the concordance
between the two techniques was found to be 98%. p16 was studied by both MCA
and Southern blot, with a concordance rate of 98%. When we considered the
six clones that were methylated in more than 10% of the cases, as well as
p16, a remarkable pattern emerged (summarized in Figure 5 and Table 2).

The 50 CRC fell into two distinct groups: (1) a group with a high level of
Type C methylation, whereby all the tumors had methylation of 4 or more
loci simultaneously and (2) a group where methylation of any type C clone
is extremely rare. Thus, the first group of tumors appears to display
profound global hypermethylation (GH+), which is lacking in the second
group (GH-). Interestingly, there was a great concordance between
methylation of the p16 gene, which was not selected for by our cloning
process, and the presence of GH. In sharp contrast, Type A methylation was
not significantly different between GH+ and GH- tumors (Table 2).
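
The grouping of tumors described above can be illustrated with a short
Python sketch. The threshold of four simultaneously methylated loci is
taken from the text; treating every tumor below that threshold as GH- is a
simplifying assumption of this example, as are the function names and the
per-tumor methylation calls, which are hypothetical and are not the data
summarized in Figure 5 or Table 2.

```python
def count_methylated(calls):
    """Count loci called methylated in a {locus name: bool} mapping."""
    return sum(1 for is_methylated in calls.values() if is_methylated)


def gh_status(calls, min_loci=4):
    """Return "GH+" if at least `min_loci` type C loci are methylated."""
    return "GH+" if count_methylated(calls) >= min_loci else "GH-"


# Hypothetical methylation calls for two tumors at type C loci and p16.
tumor_1 = {"p16": True, "MINT1": True, "MINT2": True,
           "MINT12": True, "MINT31": False}
tumor_2 = {"p16": False, "MINT1": False, "MINT2": False,
           "MINT12": False, "MINT31": False}

for name, calls in (("tumor_1", tumor_1), ("tumor_2", tumor_2)):
    print(name, count_methylated(calls), gh_status(calls))
```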

GH was also detected in a subset of colorectal adenomas (Figure 5),
suggesting that it is an early event in carcinogenesis. Interestingly,
while 5 of 5 small adenomas (<7 mm) were GH-, 6 of 9 large adenomas (>10
mm) were GH+, suggesting that this defect may be acquired in the transition
between small and large adenomas. In 6 cases, both an adenoma and a cancer
from the same patient were examined. In one of these, GH was detected both
in the adenoma and the cancer; in 3 cases, GH was detected in the cancer
but not in the adenoma and in 2 cases, GH was detected in neither the
adenoma nor the cancer.

By contrast to type A methylation, type C methylation is relatively
infrequent in primary CRC, and is never observed in normal colon mucosa.
Furthermore, detailed analysis of type C methylation in CRC revealed a
striking pattern, suggesting the presence of global hypermethylation in a
subset of these tumors: GH positive cases are characterized by frequent and
concordant methylation of all type C clones examined, such that each tumor
has at least four methylation events. By contrast, type C methylation is
virtually non-existent in tumors without GH. This concordance cannot be due
to simple experimental variation or artifacts because (1) methylation was
verified using separate methods (MCA, bisulfite-PCR and Southern blots), (2)
the concordance was not limited to MCA/RDA derived clones since it also
affected the p16 (Herman et al., 1995) and hMLH1 (Kane et al., 1997) genes,
and (3) there was no significant difference in type A methylation between
GH+ and GH- tumors. Global hypermethylation appears to be an early event
in the development of CRC, being detectable in large pre-neoplastic
adenomas. Because many genes are potential candidates for inactivation
through promoter methylation (Baylin et al., 1998; Jones, 1996), global
hypermethylation may have profound pathophysiologic consequences in
neoplasia through the simultaneous inactivation of tumor-suppressor genes
(such as p16), metastasis-suppressor genes (such as E-cadherin), angiogenesis
inhibitors (such as Thrombospondin-1) and others. In fact, our data suggest
that global hypermethylation could also result in mismatch repair deficiency
through methylation and inactivation of the hMLH1 promoter, and may
explain up to 75% of cases of sporadic CRC with microsatellite instability.
The causes of type A and type C methylation are probably different because
the latter is detected only in a limited number of cases, and the genes affected
are different. Because of the remarkable concordance in type C methylation
among GH+ cases, it appears likely that these tumors all share a specific
defect in the maintenance of the methylation-free state in CGIs. This defect
could be either aberrant de-novo methylation (through a mutation in
DNA-methyltransferase for example), or loss of protection against de-novo
methylation, through the loss of a trans-activating factor (Macleod et al., 1994;
Chen et al., 1997). Because DNA-methyltransferase activity is similar in the
two groups, the latter hypothesis is more likely. Thus, at least in colorectal
cancer, it appears likely that type C methylation (an epigenetic error) is
actually caused by a genetic event that results in an increased chance of
methylating a subset of CGIs. Ironically, this epigenetic defect may then
result in additional genetic lesions through the induction of mismatch-repair
deficiency.

EXAMPLE 6

MICROSATELLITE INSTABILITY IS LINKED TO
GLOBAL HYPERMETHYLATION IN CRC

In a previous study (Ahuja et al., 1997), a link was reported between
microsatellite instability and a hypermethylator phenotype in sporadic CRC.
Relatively few mutations in mismatch repair genes have been reported in
sporadic MI+ cancers, but hMLH1 methylation has recently been observed in
some cases (Kane et al., 1997). To determine the relation between global
hypermethylation and microsatellite instability in CRC, we measured hMLH1
methylation using bisulfite-PCR in our panel of CRC which had also been
previously typed for the presence of microsatellite instability (Figure 5).
hMLH1 was studied by bisulfite-PCR only, because it does not have 2 SmaI
sites in its CGI. Overall, 16 out of 50 (32%) cancers had evidence of
microsatellite instability. Among the 29 GH+ cases, 12 had evidence of
hMLH1 methylation, suggesting that hMLH1 is one of the targets of global
hypermethylation in CRC. All of these 12 tumors had microsatellite
instability. By contrast, hMLH1 methylation was detected in only one of the
21 GH- cases. These data establish a strong link between the GH phenotype,
hMLH1 methylation and microsatellite instability in CRC. Two lines of
evidence suggest that microsatellite instability may follow, and be caused by,
global hypermethylation and hMLH1 methylation. First, GH is detectable in
about half of colonic adenomas, but none of these tumors have hMLH1
methylation, and microsatellite instability is extremely rare in this
pre-neoplastic lesion (Samowitz and Slattery, 1997). Second, GH is not simply
caused by mismatch repair defects because microsatellite instability is absent
in more than half of the GH+ cases, and GH was absent in 4 of the 16 cancers
with microsatellite instability. Overall, our data suggest that, in sporadic
CRC, the majority (12 out of 16, or 75%) of cases with microsatellite
instability may be caused by GH followed by hMLH1 methylation, loss of
hMLH1 expression and resultant mismatch repair deficiency (Herman et al.,
submitted).
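The proportions quoted in this example can be re-derived from the raw counts. The sketch below is illustrative only: it recomputes the two percentages and, as an assumption not taken from the text, applies a two-sided Fisher exact test to the hMLH1 methylation counts in GH+ versus GH- cancers. No such test is reported here; the function and variable names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x first-column cases in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Counts quoted in the text of this example.
GH_POS_TOTAL, GH_POS_METHYLATED = 29, 12   # hMLH1 methylation among GH+ cancers
GH_NEG_TOTAL, GH_NEG_METHYLATED = 21, 1    # hMLH1 methylation among GH- cancers
MSI_TOTAL, CANCERS_TOTAL = 16, 50          # cancers with microsatellite instability

p_value = fisher_exact_two_sided(
    GH_POS_METHYLATED, GH_POS_TOTAL - GH_POS_METHYLATED,
    GH_NEG_METHYLATED, GH_NEG_TOTAL - GH_NEG_METHYLATED,
)
msi_percent = round(100 * MSI_TOTAL / CANCERS_TOTAL)
msi_with_hmlh1_percent = round(100 * GH_POS_METHYLATED / MSI_TOTAL)

print(f"MSI frequency: {msi_percent}%")
print(f"MSI cases with GH and hMLH1 methylation: {msi_with_hmlh1_percent}%")
print(f"hMLH1 methylation, GH+ vs GH-: p = {p_value:.4f}")
```

The exact test is used because one cell of the table holds a single case, where a chi-square approximation would be unreliable.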

Based on these data, the following model has been developed
integrating CGI methylation into CRC development (Figure 6). In this model,
CGI methylation plays two distinct roles, and appears to arise through distinct
mechanisms. Initially, type A methylation arises as a function of age in
normal colorectal epithelial cells. By affecting genes that regulate the growth
and/or differentiation of these cells, such methylation results in a
hyperproliferative state, which is thought to precede tumor formation in the
colon (reviewed by Lipkin, 1988). Such hyperproliferation is known to arise
with age in colorectal epithelium (Holt et al., 1988, Roncucci et al., 1988), and
to be marked in patients with CRC. The cause of type A methylation is
unknown, but without being bound by theory it is possible that it is related to
endogenous factors inherent to the structure of DNA, and that it may be
modulated by factors such as level of ongoing expression and exposure to
carcinogenic insults. Furthermore, modulation of type A methylation may
provide one possible explanation for the reduction in CRC tumorigenesis by
reducing levels of DNA-methyltransferase (Laird et al., 1995).

A second major role for CGI methylation appears later, perhaps at the
transition between small and large adenomas in the colon. This methylation
(type C) affects only a subset of tumors, which then evolve along a pathway of
global hypermethylation. This GH leads to cancer development through the
simultaneous inactivation of multiple tumor-suppressor genes such as p16, and
induction of mismatch repair deficiency through inactivation of hMLH1. The
cause of this global hypermethylation is unknown, but may well be related to
inactivation of a gene that protects CGIs from de-novo methylation. Finally,
we propose that tumors without GH evolve along more classic genetic
instability pathways, including chromosomal instability (Lengauer et al.,
1997A). Interestingly in this regard, Lengauer et al. found an inverse
correlation between chromosomal instability and MMR deficiency in CRC
cell lines (Lengauer et al., 1997B).

While based on CRC, this model is applicable to most human
malignancies. Evidence has also been found for type A and type C
methylation in brain tumors (Li et al., 1998). Preliminary evidence also
suggests the presence of global hypermethylation in multiple types of cancers,
including stomach cancers, brain tumors and hematopoietic malignancies.

In conclusion, a novel method, MCA, has been developed to
selectively amplify methylated CGIs. Using MCA/RDA, 33 differentially
methylated clones in CRC were isolated. The methylation profile of these
clones revealed that nearly all methylation in CRC can be accounted for by (1)
age-related methylation and (2) a hypermethylator phenotype presumably
caused by global hypermethylation. Deciphering the mechanisms underlying
these phenomena should facilitate the early detection, prevention and therapy
of cancers, including colorectal cancers.

EXAMPLE 7

IDENTIFICATION OF CACNA1G AS A TARGET FOR
HYPERMETHYLATION ON HUMAN CHROMOSOME 17q21

To identify genes differentially methylated in colorectal cancer,
methylated CpG island amplification was used followed by representational
difference analysis (Razin and Cedar, Cell 77: 473-476, 1994, herein
incorporated by reference). One of the clones recovered (MINT31, see above)
mapped to human chromosome 17q21 using a radiation hybrid panel, and a
Blast search revealed this fragment to be completely identical to part of a BAC
clone (Genbank: AC004590) sequenced by high throughput genomic
sequencing. The region surrounding MINT31 fulfills the criteria of a CpG
island: GC content 0.67, CpG/GpC ratio 0.78 and a total of 305 CpG sites in a
4 kb region. Using this CpG island and 10 kb of flanking sequences in a Blast
analysis, several regions highly homologous to the rat T-type calcium channel
gene, CACNA1G, were identified (Perez-Reyes et al., Nature 391: 896-900,
1998, herein incorporated by reference). Several ESTs were also identified in
this region. Using Genscan, 2 putative coding sequences (G1 and G2) were
identified. Blastp analysis revealed that G1 has a high homology to the
EH-domain-binding protein, epsin, while G2 is homologous to a C. elegans
hypothetical protein (accession No. 2496828).
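The CpG island criteria quoted above (GC content, CpG/GpC ratio and number of CpG sites) can be computed from any sequence with a few string counts. The following is a minimal sketch, assuming simple single-strand dinucleotide counts; the exact definitions used for the reported values are not stated in the text, and the function name and the toy sequence are hypothetical.

```python
def cpg_island_stats(sequence):
    """Return GC content, CpG count, GpC count and CpG/GpC ratio for a DNA string."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc_content = (seq.count("G") + seq.count("C")) / len(seq)
    # "CG" and "GC" cannot overlap with themselves, so str.count is exact here.
    cpg_sites = seq.count("CG")
    gpc_sites = seq.count("GC")
    ratio = cpg_sites / gpc_sites if gpc_sites else float("nan")
    return {
        "gc_content": gc_content,
        "cpg_sites": cpg_sites,
        "gpc_sites": gpc_sites,
        "cpg_gpc_ratio": ratio,
    }


# Toy sequence for illustration only; it is not taken from MINT31 or AC004590.
stats = cpg_island_stats("CGCGGCAT")
print(stats)
```

Applied to the 4 kb region around MINT31, a calculation of this kind would be expected to return values comparable to those reported, provided the same definitions were used.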

The MINT31 CpG island corresponds to the 3' regions of G1 and G2,
based on the direction of the open reading frame and the presence of a poly A
tail, and is unlikely to influence their transcription. The EST closest to
MINT31 (H13333) was sequenced entirely and was found not to contain a
continuous open reading frame, but a poly-adenylation signal was identified
on one end, along with a poly A tail. These data suggest that H13333
corresponds to the last 2 exons of an unidentified gene. MINT31 is in the
intron of this gene and is unlikely to influence its transcription. However,
based on both promoter prediction (TSSG) analysis of this region and
homology to the rat CACNA1G sequence, the MINT31 CpG island is also in
the 5' region of the human CACNA1G gene and may play a role in its
transcriptional activity.

The human CACNA1G sequence deposited in Genbank lacks the 5'
region of the gene, when compared to the rat homologue. To determine the 5'
region of human CACNA1G, we amplified cDNA by RT-PCR using primers
based on the BAC sequence (Genbank: AC004590, herein incorporated by
reference). The PCR products were cloned and sequenced, and the genomic
organization of the gene was determined by comparing the newly identified
sequences as well as the known sequences to the BAC that covers this region.
CACNA1G is composed of 34 exons which span a 70 kb area. Based on
sequences deposited in Genbank, the gene has two possible 3' ends caused by
alternate splicing. CACNA1G is highly homologous to rat CACNA1G with
93% identity at the protein level, and 89% identity at the nucleotide level. The
5' flanking region of CACNA1G lacks TATA and CAAT boxes, which is
similar to many housekeeping genes. A putative TFIID binding site was
identified 547-556 bp upstream from the translation start site, and several
other potential transcription factor binding sites such as AP1 (1 site), AP2 (2
sites) and SP1 (10 sites), were identified upstream of CACNA1G exon 1 using
the promoter prediction program, TESS (data not shown).

The CACNA1G CpG island is 4 kb, and is larger than many typical
CpG islands. MINT31 corresponds to the 5' edge of the island while
CACNA1G is in the 3' region. It is not known whether large CpG islands such
as this are coordinately regulated with regards to protection from methylation,
and aberrant methylation in cancer. To address this issue, the methylation
status of the 5' region of CACNA1G was studied using bisulfite-PCR of DNA
from normal tissues as well as 35 human cancer cell lines from colon, lung,
prostate, breast and hematopoietic tumors. The CpG island was divided into 8
regions and their methylation status was examined separately. The genomic
DNA was treated with sodium bisulfite and PCR amplified using primers
containing no or a minimum number of CpG sites. Methylated alleles were
detected by digesting the PCR products using restriction enzymes which
specifically cleave sites created or retained due to the presence of methylated
CpGs. None of the regions was methylated in normal colon, consistent with a
uniform protection against de-novo methylation.
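The logic of detecting methylated alleles by restriction digestion of bisulfite-PCR products can be illustrated with a short simulation. This is a sketch under stated assumptions: only the top strand is modeled, conversion is taken to be complete, and TaqI (recognition site TCGA) is used purely as an example of an enzyme whose site is created only when the CpG cytosine was methylated. The enzymes actually used for each region are not named in this passage, and the template sequence and function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Top strand as read after bisulfite treatment and PCR.

    Unmethylated C is read as T; a methylated C (0-based index listed in
    methylated_positions) is protected and stays C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and index not in methylated else base
        for index, base in enumerate(sequence.upper())
    )


def is_cut(converted_sequence, recognition_site):
    """True if the recognition site is present in the converted amplicon."""
    return recognition_site in converted_sequence


TAQI_SITE = "TCGA"          # illustrative enzyme choice, an assumption
template = "AACCGATT"       # hypothetical locus; the CpG cytosine is at index 3

methylated_allele = bisulfite_convert(template, {3})
unmethylated_allele = bisulfite_convert(template, set())

print(methylated_allele, is_cut(methylated_allele, TAQI_SITE))
print(unmethylated_allele, is_cut(unmethylated_allele, TAQI_SITE))
```

In this toy case the site exists only in the amplicon from the methylated allele, so digestion followed by gel electrophoresis would separate the two populations.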

Regions 1 and 2 were frequently methylated in cancer cell lines, and
behaved in a concordant manner. These 2 regions were methylated in most
cancer cell types except gliomas, and in most cell lines where methylation was
found, both regions were methylated simultaneously. Region 3, which is less CG
rich than any of the other regions, had either no methylation or very low levels
of methylation in most cell lines. Regions 5, 6, and 7 behaved quite
differently compared to 1-3. Methylation of these regions was less frequent
than regions 1-2, as 22/35 cell lines had no detectable methylation there,
despite often showing methylation of regions 1-2. However, when methylation
was present (in 13/35 cell lines), it affected all 3 regions simultaneously,
although to varying extents. Finally, regions 4 and 8 behaved differently
again, being partially methylated primarily in colon and breast cell lines.
Therefore, with regards to hypermethylation in cancer, the CpG rich region
upstream of CACNA1G appears to be composed of 2 CpG islands which
behave independently. MINT31 corresponds to the upstream CpG island
(island 1, regions 1 and 2), while the 5' region of CACNA1G is contained in
the downstream CpG island (island 2, regions 5-7). Regions 3, 4 and 8
correspond to the edges of these CpG islands, and behave somewhat differently
from the centers of the CpG islands, as previously described for the E-Cad gene
(Graff et al., J. Biol. Chem. 272: 22322-22329, 1997).

Overall, the methylation patterns fell into 5 distinct categories: (1) No
methylation in any region (normal tissue). (2) Slight methylation of island 1
(6 cell lines, see for example TSU-PR1 in Fig. 2). (3) Heavy methylation of
island 1 but no methylation of island 2 (16 cell lines, see for example Caco2 in
Fig. 2). (4) Heavy methylation of island 1 and moderate to heavy methylation
of island 2 (6 cell lines, see for example RKO and Raji in Fig. 2). (5) High
methylation of island 1 and low to moderate methylation of island 2 (7 cell
lines, see for example MB-231 in Fig. 2).

In a previous study of rat CACNA1G, this gene was shown to be
expressed most abundantly in the brain (Perez-Reyes et al., Nature 391: 896-
900, 1998). To determine the expression of CACNA1G in normal and
neoplastic human cells, RT-PCR was performed using cDNA from various
normal tissues and from a panel of 27 tumor cell lines. CACNA1G was
expressed ubiquitously in a variety of tissues and cell lines. In normal tissues
expression was relatively low but easily detectable, while most cell lines had
relatively high expression of CACNA1G. However, some cell lines had
negligible or totally absent levels of CACNA1G expression. The results of
CACNA1G expression were correlated with the detailed methylation analysis
previously described. In this analysis, a remarkable pattern emerged.
Methylation of regions 1-4 and 8 had no effect on CACNA1G expression.
However, there was a strong correlation between methylation of regions 5-7
and expression of the gene. In fact, all cell lines that lack methylation of this
region strongly express the gene. All 6 cell lines with pattern 4 methylation
studied had no detectable expression. Finally, the 7 cell lines with pattern 5
methylation (examples DLD-1 and MB-453) had variable levels of expression
ranging from very low to near normal. The fact that patterns 3 and 5 differ
significantly with regards to expression, but are almost identical with regards
to methylation of all regions except 7, suggests that this area is important in the
inactivation of CACNA1G.

To confirm whether methylation of the 5' CpG island of CACNA1G is
really associated with gene inactivation, 3 non-expressing cell lines showing
pattern 4 methylation (RKO, SW48 and Raji) and 2 weakly expressing cell
lines showing pattern 5 methylation (MB-231 and MB-435) were treated with
1 µM of the methyl-transferase inhibitor 5-deoxy-azacitidine. After treatment,
all these cell lines re-expressed CACNA1G mRNA. Consistent with
re-expression, demethylation of region 7 was observed after 5-deoxy-azacitidine
treatment (Fig. 3C).

De novo cytosine methylation is thought to sometimes occur in vitro
during cell propagation (Antequera et al., Cell 62: 503-514, 1990). To
determine whether the methylation of CACNA1G occurs in vivo, primary
human tumors were examined for methylation of the 5' region of CACNA1G.
Aberrant methylation was detected in 17 out of 49 (35%) colorectal cancers, 4
out of 28 colorectal adenomas (25%), 4 out of 16 (25%) gastric cancers and 3
out of 17 (18%) acute myelogenous leukemia cases. In colorectal cancers,
there was a significant correlation between methylation of CACNA1G and
methylation of p16 (p<0.005) and hMLH1 (p<0.001), as well as a strong
correlation with the presence of microsatellite instability, and the recently
identified CpG island methylator phenotype (CIMP), supporting the conclusion
that CACNA1G is also a target for CIMP in colorectal cancer.

To determine whether aberrant methylation of the 5' region of
CACNA1G affects the expression status of this gene in primary tumors, we
performed RT-PCR using cDNA from a series of colorectal adenomas. Six
out of 8 cases which showed no methylation of region 7 expressed CACNA1G.
In sharp contrast, all 5 cases that showed methylation of region 7 had no
detectable expression of this gene.

Thus, a human T-type calcium channel gene (CACNA1G) has been
identified and cloned using the MINT31 sequence as a probe. The human
T-type calcium channel gene has been determined to be a target of aberrant
methylation and silencing in human tumors. The data show that MINT31 (a
representative sequence of MINT1-33) can be used as a probe to identify
genes that play a role in disorders such as cell proliferative disorders.

Detailed analysis of the CpG island upstream of CACNA1G revealed
that methylation 300 to 800 bp upstream of the gene closely correlated with
transcriptional inactivation. The CACNA1G promoter is contained in a large
CG rich area that is not coordinately methylated in cancer. The CpG island
around MINT31 is much more frequently methylated in cancers compared to
that just upstream of CACNA1G. This may simply be caused by differential
susceptibility to de-novo methylation between these two regions, with
methylation of MINT31 serving as a trigger, and eventually spreading to
CACNA1G, as described in other genes (Graff et al., J. Biol. Chem. 272:
22322-22329, 1997). However, it is likely that these 2 regions are controlled
by different mechanisms because (1) cell lines kept in culture for countless
generations do not in fact spread methylation from MINT31 to CACNA1G
(e.g., Caco2), (2) region 3 that separates the 2 islands is infrequently and
sparsely methylated in cancer and (3) 2 cases of primary colorectal cancer
were found which are methylated at the CACNA1G promoter but not at
MINT31. Therefore, methylation of MINT31 appears to be independent of
methylation of CACNA1G, suggesting that they are 2 distinct CpG islands
regulated by different mechanisms. These data leave open the possibility that
MINT31 is the promoter for an unidentified gene, which may perhaps be
transcribed opposite to CACNA1G.



Many CpG islands of silenced genes appear to be methylated
uniformly and heavily throughout the island (e.g., Graff et al., J. Biol. Chem.
272: 22322-22329, 1997). In contrast, the methylation pattern of the 5' region
of CACNA1G (regions 5-7) was heterogeneous in the cell lines which did not
express this gene. Nevertheless, methylation does appear to play a role in
CACNA1G repression since demethylation readily reactivates the gene.



The causes of CACNA1G methylation remain to be determined.
Methylation was not detected in normal colon mucosa, placenta, normal breast
epithelium and normal bone marrow, including samples from aged patients,
suggesting that methylation of this region is cancer specific. However, there
was a significant correlation between methylation of CACNA1G and other
tumor suppressor genes such as p16 and hMLH1. Thus, CACNA1G probably
is a target for the recently described CIMP phenotype, which results in a form
of epigenetic instability with simultaneous inactivation of multiple genes. It
should be noted that a gene identified by the method of the invention
(MINT31) has been successfully utilized to identify another gene of interest
(CACNA1G) whose methylation pattern correlates with the presence of
specific cell proliferative disorders.

T-type calcium channels are involved not only in electrophysiological
rhythm generation but also in the control of cytosolic calcium during cell
proliferation and cell death (reviewed in Berridge et al., Nature 395: 645-
648, 1998). The results demonstrate that the expression of CACNA1G is not
limited to brain and heart, suggesting that it may play a role in these other
tissues. It has previously been shown that Ca2+ influx via T-type channels is
an important factor during the initial stages of cell death such as apoptosis
(Berridge et al., Nature 395: 645-648, 1998), ischemia (Fern, J. Neurosci. 18:
7232-7243, 1998) and complement-induced cytotoxicity (Newsholme et al.,
Biochem. J. 221: 773-779, 1993). These studies determining the methylation
status of CACNA1G suggest that the impairment of voltage gated calcium
channels may play an important role in cancer development and progression
through altering calcium signaling.

EXAMPLE 3

EXPERIMENTAL PROCEDURES

Methylated CpG Island Amplification.

The procedure is outlined in Figure 1. Five µg of DNA were digested
with 100 units of SmaI for 6 hours (all restriction enzymes were from NEB).
The DNA was then digested with 20 units of XmaI for 16 hours. DNA
fragments were then precipitated with ethanol. RXMA and RMCA PCR
adaptors were prepared by incubation of the oligonucleotides RXMA24
(5'-AGCACTCTCCAGCCTCTCACCGAC-3') (SEQ ID NO:34) and
RXMA12 (5'-CCGGGTCGGTGA-3') (SEQ ID NO:35), or RMCA24
(5'-CCACCGCCATCCGAGCCTTTCTGC-3') (SEQ ID NO:36) and
RMCA12 (5'-CCGGGCAGAAAG-3') (SEQ ID NO:37) at 65°C for two min.
followed by cooling to room temperature. 0.5 µg of DNA was ligated to
0.5 nmol of RXMA or RMCA adaptor using T4 DNA ligase (NEB). PCR was
performed using 3 µl of each of the ligation mix as a template in a 100 µl
volume containing 100 pmol of RXMA24 or RMCA24 primer, 5 units of Taq
DNA polymerase (GIBCO-BRL), 4 mM MgCl2, 16 mM of (NH4)2SO4,
10 mg/ml of BSA, and 5% v/v DMSO. The reaction mixture was incubated at
72°C for 5 min and at 95°C for 3 min. Samples were then subjected to 25
cycles of amplification consisting of 1 min at 95°C, and 3 min either at 72°C
or 77°C in a thermal cycler (Hybaid, Inc.). The final extension time was 10
min.
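The selection principle behind the sequential SmaI and XmaI digestion can be sketched in a few lines. The model below is deliberately simplified and rests on assumptions that should be stated plainly: every CCCGGG site is either fully methylated or fully unmethylated, SmaI cuts only unmethylated sites and leaves blunt ends, XmaI then cuts the remaining (methylated) sites and leaves CCGG overhangs, and only fragments with adaptor-compatible overhangs at both ends are amplified. Fragment length limits of PCR are ignored, and the function names and toy sequence are hypothetical.

```python
import re

SITE = "CCCGGG"  # shared recognition sequence of SmaI and XmaI


def find_sites(sequence):
    """0-based start positions of every CCCGGG site in the sequence."""
    return [match.start() for match in re.finditer(SITE, sequence.upper())]


def mca_amplifiable_fragments(sequence, methylated_sites):
    """Fragments expected to carry adaptors on both ends.

    A fragment between two adjacent sites qualifies only if both sites were
    methylated, i.e. spared by SmaI and later cut by XmaI. Each fragment is
    reported as (start of left site, end of right site).
    """
    methylated = set(methylated_sites)
    sites = find_sites(sequence)
    fragments = []
    for left, right in zip(sites, sites[1:]):
        if left in methylated and right in methylated:
            fragments.append((left, right + len(SITE)))
    return fragments


# Toy sequence with three sites at positions 2, 12 and 22.
toy = "AA" + SITE + "TTTT" + SITE + "AAAA" + SITE + "TT"

print(find_sites(toy))
print(mca_amplifiable_fragments(toy, {2, 12}))
print(mca_amplifiable_fragments(toy, {2, 22}))
```

In this model a single unmethylated site between two methylated ones is enough to prevent amplification of both neighboring fragments, which is consistent with the method enriching for densely methylated CpG islands.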

Detection of Aberrant Methylation Using MCA.

MCA products from normal colon mucosa and corresponding cancer
tissues were prepared as described above. One µg of MCA products was
resuspended in 4 µl of TE (10 mM Tris pH 8.0, 1 mM EDTA pH 8.0), mixed
with 2 µl of 20X SSC, and a 1 µl aliquot of this mix was blotted onto nylon
membranes (Nunc) using a 96 well replication system (Nunc). The
membranes were baked at 80°C, UV crosslinked for 2 min. and hybridized
using 32P-labeled probes. Each sample was blotted in duplicate. Each filter
included mixtures of a positive control (Caco2) and a negative control (normal
colon mucosa from an 18 year old individual). The filters were exposed to a
phosphor screen for 24 to 72 hours and developed using a phosphorimager
(Molecular Dynamics). The intensity of each signal was calculated using the
ImageQuant software, and methylation levels were determined relative to the
control samples.
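The text does not give the formula used to express methylation relative to the controls. One plausible reading, offered here only as an assumption, is a linear interpolation of the averaged duplicate signal between the negative and positive control intensities on the same filter. The function name and the numeric intensities below are hypothetical.

```python
from statistics import mean


def relative_methylation(sample_signals, negative_control, positive_control):
    """Fraction of the positive-control signal, after background subtraction.

    sample_signals holds the duplicate dot intensities for one sample.
    The result is clipped to the range 0.0 to 1.0.
    """
    if positive_control <= negative_control:
        raise ValueError("positive control must exceed negative control")
    span = positive_control - negative_control
    fraction = (mean(sample_signals) - negative_control) / span
    return min(1.0, max(0.0, fraction))


# Hypothetical phosphorimager intensities in arbitrary units.
level = relative_methylation([150, 250], negative_control=100, positive_control=300)
print(level)
```

Because each filter carried mixtures of the two controls, a calibration curve fitted to those mixtures would be an equally reasonable alternative to this two-point interpolation.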

RDA.

RDA was performed essentially as previously reported (Lisitsyn et al.,
1993) with the following modifications. For the first and second rounds of
competitive hybridization, 500 ng and 100 ng of ligation mix were used,
respectively. To eliminate the digested adaptor, a cDNA spun column
(Amersham) was used instead of excising from the agarose gel. Primers used
for the first and second rounds of RDA are as follows:



JXMA24    5'-ACCGACGTCGACTATCCATGAACC-3'    SEQ ID NO:38
JXMA12    5'-CCGGGGTTCATG-3'                SEQ ID NO:39
JMCA24    5'-GTGAGGGTCGGATCTGGCTGGCTC-3'    SEQ ID NO:40
JMCA12    5'-CCGGGAGCCAGC-3'                SEQ ID NO:41
NXMA24    5'-AGGCAACTGTGCTATCCGAGTGAC-3'    SEQ ID NO:42
NXMA12    5'-CCGGGTCACTCG-3'                SEQ ID NO:43
NMCA24    5'-GTTAGCGGACACAGGGCGGGTCAC-3'    SEQ ID NO:44
NMCA12    5'-CCGGGTGACCCG-3'                SEQ ID NO:45

After the second round of competitive hybridization, PCR products
were digested with XmaI. The J adaptor was eliminated by column filtration.
The PCR products were then subcloned into Bluescript SK(-) (Stratagene). To
screen for inserts, a total of 396 clones were cultured overnight in LB medium
with ampicillin and 3 µl of the culture was directly used as template for a PCR
reaction. Each clone was amplified with
T3 (5'-AATTAACCCTCACTAAAGGG-3') (SEQ ID NO:46)
and
T7 (5'-GTAATACGACTCACTATAGGGC-3') (SEQ ID NO:47)
primers,
blotted onto nylon membranes, and screened for cross hybridization with
32P-labeled inserts. The clones differentially hybridizing to tester and driver MCA
products were further characterized by Southern blot analysis and DNA
sequencing.
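The adaptor pairs listed in this section and in the MCA procedure all follow the same design: the 12-mer begins with CCGG, which matches the overhang left by XmaI, and its remaining 8 bases pair with the 3' end of the corresponding 24-mer. The short check below verifies this for each pair using the sequences as printed; the helper names are hypothetical and the check is a sketch, not part of the described procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def check_adaptor(long_oligo, short_oligo, overhang_length=4):
    """Return (5' overhang of the 12-mer, whether the rest anneals to the 24-mer 3' end)."""
    overhang = short_oligo[:overhang_length]
    paired_part = short_oligo[overhang_length:]
    anneals = long_oligo.endswith(reverse_complement(paired_part))
    return overhang, anneals


# Sequences copied from the text (24-mer, 12-mer).
ADAPTORS = {
    "RXMA": ("AGCACTCTCCAGCCTCTCACCGAC", "CCGGGTCGGTGA"),
    "RMCA": ("CCACCGCCATCCGAGCCTTTCTGC", "CCGGGCAGAAAG"),
    "JXMA": ("ACCGACGTCGACTATCCATGAACC", "CCGGGGTTCATG"),
    "JMCA": ("GTGAGGGTCGGATCTGGCTGGCTC", "CCGGGAGCCAGC"),
    "NXMA": ("AGGCAACTGTGCTATCCGAGTGAC", "CCGGGTCACTCG"),
    "NMCA": ("GTTAGCGGACACAGGGCGGGTCAC", "CCGGGTGACCCG"),
}

results = {name: check_adaptor(*pair) for name, pair in ADAPTORS.items()}
for name, (overhang, anneals) in results.items():
    print(name, overhang, anneals)
```

A check of this kind is also a convenient way to catch transcription errors when adaptor sequences are retyped.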

Southern blot analysis.

Five µg of DNA was digested with 20-100 units of restriction enzymes
as specified by the manufacturer (NEB). DNA fragments were separated by
agarose gel electrophoresis and transferred to a nylon membrane (Zeta-probe,
Bio-Rad). Filters were hybridized with 32P-labeled probes and washed at
65°C with 2X SSC, 0.1% SDS for 10 min. twice, and 0.1X SSC, 0.1% SDS
for 20 min. Filters were then exposed to a phosphor screen for 24-72 hours
and analyzed by using a phosphorimager (Molecular Dynamics).

DNA sequencing and analysis.

Plasmid DNA was prepared using the Wizard Plus Minipreps
(Promega) according to the supplier's recommendation. Sequence analysis was
carried out at the Johns Hopkins Core Sequencing Facility using automated
DNA sequencers (Applied Biosystems). Sequence homologies were identified
using the BLAST program of the National Center for Biotechnology
Information (NCBI) available at http://www.ncbi.nlm.nih.gov/BLAST using
the default parameters of the web site. Putative promoter sequences were
predicted using the computer programs NNPP and TSSG available through the
Baylor College of Medicine launcher at http://dot.imgen.bcm.tmc.edu:9331.



Bisulfite-restriction methylation analysis.

DNA from colon tumors, cell lines and normal colon mucosa was
treated with bisulfite as reported previously (Herman et al., 1996). Primers
used for PCR were as follows:

hMLH1,      5'-TAGTAGTYGTTTTAGGGAGGGA-3' (SEQ ID NO:44),
            5'-TCTAAATACTCAACRAAAATACCTT-3' (SEQ ID NO:45);

MINT1,      5'-GGGTTGGAGAGTAGGGGAGTT-3' (SEQ ID NO:46),
            5'-CCATCTAAAATTACCTCRATAACTTA-3' (SEQ ID NO:47);

MINT2,      5'-YGTTATGATTTTGTTTAGTTAAT-3' (SEQ ID NO:48),
            5'-TACACCAACTACCCAACTACCTC-3' (SEQ ID NO:49);

Versican,   5'-TTATTAYGTTTTTTATGTGATT-3' (V1) (SEQ ID NO:50),
            5'-ACCTTCTACCAATTACTTCTTT-3' (V2) (SEQ ID NO:51).

Ten to 20 µl of the amplified products were digested with restriction
enzymes which distinguish methylated from unmethylated sequences as
reported previously (Sadri et al., 1996; Xiong et al., 1997), electrophoresed on
3% agarose or 5% acrylamide gels, and visualized by ethidium bromide
staining.
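Several of the bisulfite-PCR primers above contain the degenerate bases Y and R. Under the standard IUPAC nucleotide code, Y stands for C or T and R stands for A or G, so a primer with one such position is an equimolar mix of two sequences. A plausible reading, stated here as an assumption, is that these positions fall on CpG sites so that the primer anneals whether or not the cytosine was converted. The sketch below expands a degenerate primer into its explicit variants; the function name is hypothetical.

```python
from itertools import product

# Subset of the IUPAC nucleotide code needed for the primers in this section.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand_degenerate(primer):
    """List every explicit sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combination) for combination in product(*choices)]


HMLH1_FORWARD = "TAGTAGTYGTTTTAGGGAGGGA"  # as printed in the primer list
variants = expand_degenerate(HMLH1_FORWARD)
for variant in variants:
    print(variant)
```

The same function applies to the reverse primers, where R plays the corresponding role on the opposite strand.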

RT-PCR

Total RNA was prepared from normal colon epithelium and tumor cell
lines using TRIZOL (GIBCO-BRL). To study gene expression following
demethylation, cell lines were treated with 1 µM of 5-aza-2'-deoxycytidine for
2-5 days. cDNA was prepared using random hexamers and reverse
transcriptase as specified by the manufacturer (Boehringer). The expression
of versican was determined by RT-PCR using the primers

VF 5'-GCTGCCTATGAAGATGGATTTGAGC-3' (SEQ ID NO:52) and

VR 5'-GGAGTTCCCCCACTGTTGCCA-3' (SEQ ID NO:53).

The PCR products were visualized by ethidium bromide staining. The
cDNA samples were also amplified using the GAPDH gene primers

GAPF 5'-CGGAGTCAACGGATTGGTCGTAT-3' (SEQ ID NO:54) and

GAPR 5'-AGCCTTCTCCATGGTGGTGAAGAC-3' (SEQ ID NO:55)

as a control for RNA integrity. All reactions were performed using RT (-)
controls where the reverse transcriptase enzyme was omitted.

Chromosomal mapping.

The chromosomal location of clones that did not correspond to known
genes was determined using a human-rodent somatic cell hybrid panel and a
radiation hybrid panel (Research Genetics). PCR reactions were performed
using 30 ng of each of the hybrid panel DNA as a template in a 40 µl volume
containing 15 pmol of each primer, 0.5 units of Taq DNA polymerase
(GIBCO-BRL), 2 mM MgCl2, BSA and 5% DMSO. First denaturation was
carried out at 95°C for 3 min. Samples were then subjected to 35 cycles of
amplification consisting of 25 sec. at 94°C, 1 min at 60 to 68°C and 1.5 min.
at 72°C in a thermal cycler (Hybaid). The final extension time was 10 min.
Ten µl of the PCR product were electrophoresed in a 2% agarose gel and the
genotype of each panel was determined. Linkage analysis was performed
using the RH server of Stanford University as described (Stewart et al., 1997).

Although the invention has been described with reference to the 
presently preferred embodiment, it should be understood that various 
modifications can be made without departing from the spirit of the invention. 
Accordingly, the invention is limited only by the following claims.

What is claimed is:

1. A method for identifying a methylated CpG-containing nucleic acid,
comprising:

a) contacting a nucleic acid sample suspected of
containing a CpG-containing nucleic acid with a
methylation sensitive restriction endonuclease, under
conditions and for a time to allow cleavage of the
nucleic acid;

b) contacting the sample with an isoschizomer of said
methylation sensitive restriction endonuclease, wherein
said isoschizomer of said methylation sensitive
restriction endonuclease cleaves both methylated and
unmethylated CpG sites;

c) adding oligonucleotides to the nucleic acid sample 
under conditions and for a time to allow ligation of the 
oligonucleotides to the nucleic acid cleaved by said 
restriction endonuclease; and 

d) amplifying said cleaved nucleic acid. 

2. The method of claim 1, wherein said methylation sensitive restriction
endonuclease is SmaI.

3. The method of claim 1, wherein said amplifying is by polymerase
chain reaction amplification.

4. The method of claim 3, wherein said amplifying by polymerase chain
reaction amplification comprises annealing primers complementary to
said oligonucleotide.

5. The method of claim 1, wherein said oligonucleotide comprises a
sequence as set forth in a member of the group selected from SEQ ID
NO:34 (RXMA24) and SEQ ID NO:35 (RXMA12).

6. The method of claim 1, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:36 (RMCA24) and SEQ ID NO:37 (RMCA12). 

7. The method of claim 1, further comprising adhering the amplified
nucleic acid to a membrane. 



8. The method of claim 7, further comprising hybridizing the membrane 
with a probe of interest. 

9. The method of claim 1, wherein the CpG containing nucleic acid 
comprises a methylated CpG island. 
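
For orientation, a candidate region can be screened for CpG-island character with simple composition statistics. The sketch below is an assumption-laden illustration: the thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6) are commonly used working values and are not taken from this document, and the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio)."""
    seq = "".join(seq.split()).upper()  # drop spaces and line breaks
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    observed = seq.count("CG")
    expected = c * g / n if n else 0.0
    gc_fraction = (c + g) / n if n else 0.0
    ratio = observed / expected if expected else 0.0
    return n, gc_fraction, ratio


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed thresholds to one sequence."""
    n, gc_fraction, ratio = cpg_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and ratio >= min_ratio


if __name__ == "__main__":
    print(cpg_stats("CG" * 100))
    print(looks_like_cpg_island("AT" * 100))
```

Such a screen says nothing about methylation status; it only describes base composition.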

10. The method of claim 9, wherein the CpG island comprises a CpG
island located in a gene selected from the group consisting of a p16, an
Rb, a VHL, an hMLH1, and a BRCA1 gene.

11. The method of claim 1, wherein said sample is selected from the group
consisting of a brain cell, a colon cell, a urogenital cell, a lung cell, a renal 
cell, a hematopoietic cell, a breast cell, a thymus cell, a testis cell, an ovarian 
cell, a uterine cell, an intestinal cell, serum, urine, saliva, cerebrospinal fluid, 
pleural fluid, ascites fluid, sputum, and stool. 

12. The method of claim 1, wherein the presence of methylated
CpG-containing nucleic acid in the sample is indicative of a cell
proliferative disorder. 

13. The method of claim 12, wherein the cell proliferative disorder is 
selected from the group consisting of colon cancer, lung cancer, renal 
cancer, leukemia, breast cancer, prostate cancer, uterine cancer, 
astrocytoma, glioblastoma, and neuroblastoma. 

14. The method of claim 1, further comprising performing representation 
difference analysis, wherein said representation difference analysis 
comprises hybridizing a driving nucleic acid as a driver. 

15. The method of claim 14, wherein said representation difference 
analysis uses nucleic acid isolated from a member of the group 
consisting of normal colon, normal lung, normal kidney, normal blood 
cells, normal breast, normal prostate, normal uterus, normal astrocytes, 
normal glial and normal neurons. 

16. A nucleic acid identified by the method of claim 1.

17. A vector comprising the nucleic acid of claim 16. 

18. A method for detecting an age-associated disorder, associated with
methylation of CpG islands, in a nucleic acid sequence of interest in a
subject having or at risk of having said disorder, comprising:

contacting a nucleic acid sample suspected of comprising a
CpG-containing nucleic acid with a methylation sensitive restriction
endonuclease, under conditions and for a time to allow cleavage of the
nucleic acid;

contacting the sample with an isoschizomer of said methylation 
sensitive restriction endonuclease, wherein said isoschizomer of said 
methylation sensitive restriction endonuclease cleaves both methylated 
and unmethylated CpG-sites, under conditions and for a time to allow 
cleavage of methylated nucleic acid; 

adding oligonucleotides to the nucleic acid sample under conditions 
and for a time to allow ligation of the oligonucleotides to nucleic acid 
cleaved by said restriction endonuclease; 
amplifying said cleaved nucleic acid; 

adhering the amplified digested nucleic acid to a membrane; and 
hybridizing the membrane with a probe of interest. 

19. The method of claim 18, wherein the sample is selected from the group
consisting of brain cells, colon cells, urogenital cells, lung cells, renal
cells, hematopoietic cells, breast cells, thymus cells, testis cells, ovarian
cells, uterine cells, serum, urine, saliva, cerebrospinal fluid, pleural
fluid, ascites fluid, sputum, and stool.

20. The method of claim 18, wherein the probe of interest is a nucleic acid
sequence. 

21. The method of claim 18, wherein the nucleic acid sequence is selected
from the group consisting of a p16, an Rb, a VHL, an hMLH1, and a
BRCA1 nucleic acid.

22. The method of claim 21, wherein said nucleic acid sequence is a p16
nucleic acid sequence. 

23. The method of claim 18, wherein the sample is a tissue sample or a
biological fluid sample. 

24. The method of claim 18, wherein the probe is detectably labeled. 

25. The method of claim 24, wherein the label is selected from the group 
consisting of a radioisotope, a bioluminescent compound, a 
chemiluminescent compound, a fluorescent compound, a metal chelate, 
and an enzyme. 

26. The method of claim 18, wherein said age-associated disorder is 
selected from the group consisting of atherosclerosis, diabetes mellitus,
and dementia. 

27. The method of claim 18, wherein said age-associated disorder is a cell 
proliferative disorder. 

28. The method of claim 18, wherein the nucleic acid of interest is a
member of the group consisting of SEQ ID NOs:1-33 (MINT1-33).

29. The method of claim 27, wherein said cell proliferative disorder is
selected from the group consisting of colon cancer, lung cancer, renal 
cancer, leukemia, breast cancer, prostate cancer, uterine cancer, 
astrocytoma, glioblastoma, and neuroblastoma. 

30. The method of claim 18, further comprising performing representation 
difference analysis, wherein said representation difference analysis 
comprises hybridizing a driving nucleic acid as a driver. 

31. The method of claim 30, wherein said representation difference
analysis uses nucleic acid isolated from a member of the group 
consisting of normal colon, normal lung, normal kidney, normal blood 
cells, normal breast, normal prostate, normal uterus, normal astrocytes, 
normal glial and normal neurons. 

32. A method for determining the response of a cell to an agent,
comprising: 

a) contacting a nucleic acid sample suspected of 
comprising a CpG-containing nucleic acid from said 
cell with a methylation sensitive restriction 
endonuclease, under conditions and for a time to allow 
cleavage of unmethylated nucleic acid; 

b) contacting the sample with an isoschizomer of said 
methylation sensitive restriction endonuclease, wherein 
said isoschizomer of said methylation sensitive 
restriction endonuclease cleaves methylated and 
unmethylated CpG-sites, under conditions and for a 
time to allow cleavage of methylated nucleic acid; 

c) adding an oligonucleotide to the nucleic acid sample 
under conditions and for a time to allow ligation of the 
oligonucleotide to nucleic acid cleaved by said 
restriction endonuclease; 

d) amplifying said cleaved nucleic acid; 

e) adhering the amplified cleaved nucleic acid to a 
membrane; and 

f) hybridizing the membrane with a probe of interest. 



33. The method of claim 32, further comprising performing representation
difference analysis, wherein said representation difference analysis
comprises hybridizing a nucleic acid as a driver.



WO 00/26401 PCT/US99/25251

34. The method of claim 32, wherein the agent is selected from the group 
consisting of peptide, peptidomimetic, chemical compound, and a 
pharmaceutical compound. 

35. The method of claim 32, wherein said agent is a chemotherapeutic 
agent. 

36. The method of claim 32, wherein said methylation sensitive restriction 
endonuclease is SmaI.



37. The method of claim 32, wherein said amplifying is by polymerase 
chain reaction amplification. 

38. The method of claim 37, wherein said amplifying by polymerase chain 
reaction amplification comprises annealing primers complementary to 
said oligonucleotide. 

39. The method of claim 32, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:34 (RXMA24) and SEQ ID NO:35 (RXMA12). 

40. The method of claim 32, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:36 (RMCA24) and SEQ ID NO:37 (RMCA12). 

41. The method of claim 32, wherein said cell is selected from the group
consisting of a brain cell, a colon cell, an intestinal cell, a urogenital cell, a 
lung cell, a renal cell, a hematopoietic cell, a breast cell, a thymus cell, a testis 
cell, an ovarian cell, a uterine cell, an exocrine cell, and an endocrine cell. 

42. A kit useful for the detection of a methylated CpG-containing nucleic 
acid comprising carrier means containing one or more containers 
comprising a container containing oligonucleotides for ligation to 
nucleic acid, a second container containing a methylation sensitive 
restriction endonuclease and a third container containing an 
isoschizomer of the methylation sensitive restriction endonuclease. 

43. The kit of claim 42, wherein said oligonucleotides comprise a
sequence as set forth in a member of the group selected from SEQ ID 
NO:34 (RXMA24) and SEQ ID NO:35 (RXMA12). 

44. The kit of claim 42, wherein said oligonucleotide comprises a 
sequence as set forth in a member of the group selected from SEQ ID 
NO:36 (RMCA24) and SEQ ID NO:37 (RMCA12). 

45. The kit of claim 42, further comprising one or more containers 
comprising a primer complementary to said oligonucleotide. 

46. A kit useful for the detection of a methylated CpG-containing nucleic
acid comprising a carrier means containing one or more containers
comprising a membrane, wherein said membrane has a nucleic acid
sequence selected from the group consisting of SEQ ID NO:1, SEQ
ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:9,
SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:17,
SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:23, SEQ
ID NO:24, SEQ ID NO:27, SEQ ID NO:30, SEQ ID NO:31, SEQ ID
NO:32, and SEQ ID NO:33 (MINT1, MINT2, MINT4, MINT6,
MINT8, MINT9, MINT10, MINT14, MINT15, MINT17, MINT19,
MINT20, MINT22, MINT23, MINT24, MINT27, MINT30, MINT31,
MINT32, and MINT33) immobilized on said membrane.

47. An isolated nucleic acid comprising a member selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID
NO:6, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:14,
SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:20, SEQ
ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:27, SEQ ID
NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33 (MINT1,
MINT2, MINT4, MINT6, MINT8, MINT9, MINT10, MINT14,
MINT15, MINT17, MINT19, MINT20, MINT22, MINT23, MINT24,
MINT27, MINT30, MINT31, MINT32, and MINT33), and degenerate
variants thereof.



48. The nucleic acid of claim 47, wherein said nucleic acid is methylated.

49. The nucleic acid of claim 48, wherein said nucleic acid is
unmethylated. 

50. A substantially purified polypeptide encoded by the nucleic acid of
claim 47. 

51. The nucleic acid of claim 47, wherein said nucleic acid is operatively
linked to an expression control sequence. 

52. The nucleic acid of claim 51, wherein the expression control sequence
is a promoter. 

53. The nucleic acid of claim 52, wherein the promoter is tissue specific. 

54. An expression vector containing the nucleic acid of claim 47. 

55. The vector of claim 54, wherein the vector is a plasmid. 

56. The vector of claim 54, wherein the vector is a viral vector. 

57. The vector of claim 56, wherein the viral vector is a retroviral vector. 

58. A host cell containing the vector of claim 54. 

59. The host cell of claim 58, wherein the cell is a eukaryotic cell. 

60. The host cell of claim 58, wherein the cell is a prokaryotic cell. 

61. An isolated nucleic acid sequence comprising a methylated nucleic
acid having a sequence as set forth in a member of the group consisting
of SEQ ID NOs:1-33.

62. A method of identifying a compound that affects methylation of a 
nucleic acid, comprising: 

a) incubating components comprising the compound and a 
sample comprising a nucleic acid sequence identified by 
the method of claim 1 under conditions sufficient to 
allow the components to interact; and 

b) determining the effect of the compound on expression 
of the nucleic acid sequence. 

63. The method of claim 62, wherein said sample is a cell. 

64. The method of claim 62, wherein said sample is a substantially purified 
nucleic acid. 

65. The method of claim 62, wherein the compound is selected from the 
group consisting of a peptide, a peptidomimetic, a chemical compound, 
and a pharmaceutical compound. 

Colon  Caco2

Probe: MINT2

SmaI

CCTGCGC CCT CCCAAATC7A AACAAT QISg *ATCGAA CCT 

GCGGCCGTCT CGACTCCCAG CCCCTCCCTC TCCCTTCTCT CCCCTTCGCG CCTSSggCTC 

TCTTTTATTT ATTTATTTGT CTCTCCCCCC ACCCCCCCCT TAGTCTTTCC CTCTCTTTAG 

TCTT AGT AC C TCCTTTTAAT GGAACTC CAG CCACTTGCGT AGTTCCTGCA CCCAG7GCCC 

TGGTGGTTTT T AT AAATACA GGTAAAATAA TACCCAAATT TCAGCCTCAG CTGCAGTTTC 

TTGGAAGAGG ACGACCCTCT TCCTCTCCTT CCCTCCCTCT TTCCCCCTCC CCCGTTTATT 

AGACTATCCT CGGTGCAAAG TGCGAGAAAA ATCTCCTCCA TCTCTGACTC A C CTTTTCTT 

CGCGCCGAAC TCTACCACCC AAGTTCCCTC GCGGATTTTG CACTCTGCCA AAGTCCTGAG 

TTCCTCGTTT ACTTTCAAAG CCTGAAATAT AACTGATGTC CAACTGCAGA AAGGCGCACG 
GAATCGCCGC CACCAT CCCG GG 

SmaI

MINT2: 562 bp

N1 T1 N2 T2 N3 T3 N4 T4 N5 T5 N6 T6 N7 T7 N8 T8

Figure 3

FIGURE 6A

MINT1 (SEQ ID NO: 1)
CCCGGGCTGG GTACCTGGAC CTATACCTTC ATAGCTGCCT TAGGCTCAAC TTTTCGGCGG
GGATCCCTCT GCAGACGTGC AGGTGGCGGG AGAGCAGAGG TAGCCGCAGT AAGTGCTGAG
AGAGCCTGAA AGAAACACCA TGAATTTTCA AACTCTCCCA CATACATTCC CGAAGCGCCT
GTCTGGCGTC TAAGAGAGAG CAAGAGAGGG CTGGAGAGCA GGGGAGCCCG CGGGGCTGAG
GCTCTTTGTC AGCGCCTGCA CTTCCTACGT TACAACGCCT TCATTCAGCA AAAACCTTTT
GGGCGCCTGC TGTGCGCCAG GCCAGGCGAA GNAGACCGAG GNTGTGAAGC TCAGAGGGGA
GAGGGACCAA TCGCAGTAAA TAAGCTACCG AGGTAATCTT AGATGGNGAT GAGGGCAGGA
AAAGNCATCA GNCGACCTCT GACCTTTCTC TTAGGGGGTT TTCCCCTTCC GCCTGGGTTC
TAGAACTGGG AAGANTTTTC TCCAGAGCGT CGCGGGGAGC GCCCCGGG

MINT2 (SEQ ID NO: 2)
CCCGGGCGCT GCCAAATGTA AACAATCCGC CATGATTTCT TTGTTTAGCT AATCGAACCT
GCCGCCGTCT CGAGTCCCAG GCGCTCCCTC TCCCTTCTCT CCCCTTCCCC CCTCGCGCTC
TCTTTTATTT ATTTATTTGT CTCTCCCCCC ACCCCGCCCT TAGTCTTTCC CTCTCTTTAG
TCTTAGTAGC TGCTTTTAAT GGAAGTCGAG GCAGTTGGGT AGTTGGTGCA GGGAGTGCGG
TGGTGGTTTT TATAAATACA GGTAAAATAA TACCCAAATT TCAGGCTGAG GTGCAGTTTC
TTGGAAGAGG AGGAGGGTGT TCCTCTGCTT CCCTCCCTCT TTCCCCCTCC CCCGTTTATT
AGAGTATCCT CGGTGGAAAG TGCCAGAAAA ATGTGCTGCA TCTCTGAGTC ACCTTTTCTT
CCCGCCGAAC TCTAGCACCC AAGTTCGCTG GCGGATTTTG GACTCTGCCA AAGTGCTGAG
TTCGTCGTTT ACTTTGAAAG CCTGAAATAT AACTGATGTC CAACTGCAGA AAGGCGCACG
GAATCGCCGC CACCATCCCG GG

MINT3 (SEQ ID NO: 3)
CCCGGGTTTC CAGCTTCTCC CTTCCACCTT TGTCTCCCCT CCCTCCTACA AACTTTCAGC
CTTCAGTCTG TTGGGGGNTA AGATCTGGGA AATAAGCGTG TGTGTCCGAG TGCCTTAGGG
TCTGCTGTGA GCCTGAATGC GAGTCCGGTT GGTTGTGCAG TTATGAACCT GTGTGTACAT
CTGTGTGCCT AAGACGCTGG GCAACTGTAC CCACATGACC GATGTGTGTG AACGACTGTG
TGCCTGTGTG TCTGGATCTG CGCGTGGGTG TAGTTCGTGT GGCTGTGTAA ACCGCATGCA
TAGGTATACA TGTACCTATG TCTCAGGCTG TGTGCGTCCT ACTGCGATGG TACGAGAGTG
TGGGTGTGAT GGTGTATGTG ATCCTGTGTC TGCGTGTCCG TACATTTGAG TGGGTGTTGT
GCGTGTGACT GTGTAAACTG CGAACATGTA CGTGTGCCCG CCCGTAGGTA TTACCGTGTA
CGTGTGTCTC CGTGTGCCTG TGAGGGGTGG GGTCTGCGCG GGGATTCCCG ACCCCCCCAC
ACTCACACCC TCCAAGCCCC GGG

MINT4 (SEQ ID NO: 4)
CCCGGGCCTC TGGCCCTCTG CGTCTGCTAG NCTCTTTCCC CCAAGACTCC CCGAGGTGGG
GAGAGNACTG GTGNTCCCTG GAGAAATCAA GGTGTCCAAC ATTCTCTCCG AGGCGAGGCT
GCTTGAGCGC CAGCAACAGG NCCTGCTGAA CTTTCTTCCC CGGCTCCTAC GCTCCGGTTG
CTCTCCATCC TCATTTCTGG GGTCAAATGG NAAAGAGGGA ATACTCCTCG ACCCCTCTCC
CCCTTGACTA TCCAAAGCAG CCCGAAGTTG GCGAGGAGAC TCTGCCGGGT GTNCGGGCAA
ATGNCCCGCC GGGTGGCTCC AGAAATGGNC TGTGANCTGC ACTCGCCTCG GAGAAATTCC
AACTCTTGGT TGAAGACTCT GACTCAGAGG AGCCCTCTGA GGATGCGCCC CTGGAGAAAG
NGCACGGGAG GGAAAGTGGA GAGAACTCGN CCTCCCCAGG GGCTAGNCAG CTACTCCCGG
G

MINT5 (SEQ ID NO: 5)
CCCGGGTAAG TGAGCCCTGC TGCACTCCCG CACCCCTCTT CCCCATGCCC
ACCCTCCGGG GATGCAACAC CCTGTTCCCA TGGAACACGG GGGTTGGCAG
TCACACTGTC CCCACCCAGC TTCAGGCTTG GTCTCCTCTA GGTTTGCCTT
CTGAGGAAGC AGTCCCAGGG CATTTACTGA CCAANCAGAA AACAGGGGTT
GGGAAAAGTG AGTAGGTGGG TCTGCAACCG TTACAATCAC ATCACTTTAT
TCTTAATTCG AGTATATAAG GATTGGCATC ATANTGGGAT GANGAAGGTT
ACTGTCCTTG TCATTTGGGC AAAAGACCCC TACCCATATC TCAATGACCA
ATTCCTCANA AGTGTCCTCT TGGAANAANC TGANTTTTCC CCTCCGTAAN
TCCTTAACAC TCTCCGGCTT CCCCTCAATC CCAGGCCTTC CCCCTATTGA
CCTCAAGACC TCCTTTAATC CCAAAGAGNC GTTGACTTCC NCCAAATGCG
GGTGCTTTCC TTTCCAATCA AGAATATGTT TAAAAACCCT CCCAGGGAGT
ATGGACTATG TCCCCAATTT TAAAAGATGG AGGAACAAAG GCCCATTGGT
ATGCTAAAAA CCATGGGAAC AGGATCCAGA TTTCCCCCCA TCAATTCGAN
CTGCCAGTCT GTCCTCGGAG ATCCTTTGAC TTCTTGGAAT ANCCTTTTTG
TGTTGTGGTT TGGGGGTTGA TCTTGGAGAA CTTTTTTGTG TGTCTTTTAA
AAAATGTTTC ATTNGTTAAC TTTCCAAGTG ATGCTCTGAT TGGAGCAATC
TCAACACCAA GAAGAGTAGG GAAAGAAGCA GCGGNGGTCC TGGGTCCCCG
GG

MINT6 (SEQ ID NO: 6)
CCCGGGCCCT CAGAGGCCGC ACCACCTATT GTGTTCCAGG CTCGCAGGAA GCCAGACCTT
GCGAGAGGTG TGTGGGGGCG GAGAGTGGCA CAGGTTTGAC ACTGCAGGTC GGAGGAGGAA
GACAGTGGCT GCAAAGGCAA AATCGGGTGT TATTTTCCCA AGAGTCCCTT CAGCGTGAGT
GCCGGGGTCA GCTCGAACTG GAGCCTGTAA TTTGTGAGTG CGAGTGGGGA GCAGCAGGAG
GATCCTTTTC ATAGGTGAGG TCCCAGGAAC GAGCCTGGTC NGTGCTTAGG CAAAGGCCCT
TCCCACTTGT CAGCCTTGTT GTTTACCCAT CCCCTGCTTC TCCCAGACTT GCATTAATTC
CGNNGTCGGT TCCATCAACC CCGGNATCCC CTCCCCCGGG

MINT7 (SEQ ID NO: 7)
CCCGGGACCC GCCCAGAACG CTTTCGTGGG GTTGGAGAGG GCAGGACACA GCCTCTCTGG
GCCCGGTAGT ATGCGAAGAC ACGCATAACG CAAAAGGATT CCCGTCCTGG ACTTTGGGAA
TTAAGGCCCT GAACACATTG AGCAAAAAGT AGATCTGTCT GTACAGACGT TTCTTTCCAC
GTCTTTCATA GTAAGGACTT TATTAAAAAG CAGGCACTCG AATCCTAGGT GGGTAGATGG
GAGGCTGGGG CTGGGGATGG GGACGTCCTC TGTTTTCTGG TTGTGCACAT TAAAAATAAC
TCTCTGCTGC CTACTCCTAA ACGCAGCCGG CAAAAATGAG ACGTCAACTA AGCGCCGTTT
CAGTCCCAGA GCCAGGTCGT CCATGGGGGT TTTCAAGCGT TTTCTCGATG ACTGATTTTT
CTAGAAGCGA ATGGATTTTA TTCTCCCAGG TGTAAAGGCT ACCTCCTGCT TCACACCCGG
G

MINT8 (SEQ ID NO: 8)
CCCGGGCTGT GGCGGTGCAA CTCCCGCCGC CTGCGTTCTA GACAGAAAAG CCCCTTCTGA
CCATGCATTA GTAGTACTAC TCCATTATTC CTGTTAGAAC AAGTTAAAAG TAAGGGTTGA
GCCCAAAGGC CCCAGGAGGG GGGTCATGTG CGCCCCAGTC ACTCAGGCTC CCCTCGCTTC
TCCGGTTCGA GTTATGCCAT CAAGCTAATA TATTGTGACT GCTCTTCTCT CCTGTGACAA
AAGGCTTGCA GCTGCCTCCA AATCAATAGA TTGTCAAAGA AATATTGAAA ACAATCATGA
CACAAAAACT TAATCCTGGN TTGGAGGCTA CATAATCAGA AATTGTGCTA CTTGTTCTTC
CAAGNCACCC GATTTAATTT ATCCCCAAAC TTTTAGGCAA ATTTTTATTT CCCGGGACCT
TTTTCCCAGC AGATCCTGCT ACGTCTGTCG GGTTTGTAAT GTAATTTGTA ATTNCTNCCC
TTCATGTGGT CCGGTGCCTT GAACCATCTT TAATTAAAAG CATAATTAAG GGAAGATCTA
AAGAAAGACA ATTACCAGAT GGTCTTTTTT TTAGAGGCGG TAGTTGCGCA GAGAGGGGCT
CGGGGTCTGT CCCCGGG

MINT9 (SEQ ID NO: 9)
CCCGGGTCAT TGTCCATCTC CGACCAGGGG AGTAGCCACC CCCACTAGCC AGCCGTCTTT
ATCTCCTAGA AGGGGAGGTT ACCTCTTCAA ATGAGGAGGC CCCCCAGTCC TGTTCCTCCA
CCAGCCCCAC TACGGAATGG GAGCGCATTT TAGGGTGGTT ACTCTGAAAC AAGGAGGGCC
TAGGAATCTA AGAGTGTGAA GAGTAGAGAG GAAGTACCTC TACCCACCAG CCCACCCAGT
CCCTCTCTCA GCAGTGGATG GGGATGGGGT GGGGGTAGGG ACGAGAAGGC AGCTGGTGGA
GAAACAGCTT CAAGACTCTT TGGGTTCCTC CTGCTCTCCA GGGGAGCTTA CCTGGGGCTA
ACTTTAGACA CACAGGTTTG GAGGGAGGAA AGAAGGAAGA ATTCTTTTAC AACGAATCAA
TTAAGAGCAC TTCCTCTTTT CTTAACTTGG GGGGAGGGCC AGGAAAACTT CTTGAGTCAA
GAAATCTTGG GAGGNAGACA TCAGATGTNG GCAAGAGGCA GACAGATTTT GGGAAGGCAG
GCTTTGGGTC AANAAAGATC AAGCCTGTGT TGTCCCCAGG CTCTTCCCTG TTCCCCCATC
CCGGG

MINT10 (SEQ ID NO: 10)
CCCGGGGGCT GCAGAATCAG GAGTCTTCNC CTAGGTTTGG CCTTGGGCTC CATCCTACNC
CCTGAATGTG ACNGCTGCGT CTTCGTCCCA CACCTACTTT TGAATGCCAA GAGGGGGCTC
CGCTGGCCAG GACNGAATAT TTTTATGGTA AAAAATGACC GGCAGTTGCA TCAGCTCCAG
GAGGGTGGGA GCCGTCACCC GAGGTCGCAC AGGCAGACTG ATGAAAATTC TGCTTATAAA
GTCACTGCTC CCCCATTAAT TAGGGGGGAG GGGGCGCTCC GGAGCCACCA CGCACCTCGC
CCACGGNCAA AAAGCTTGTC AACATTTTCC ACGAAGGATT GAAAATGTAA ATTAACTTTC
AGATTATTCA ATGTCACCAA GGTATGGAAA AAGGTCGCCA TACTGGGTGT CATTTATCTC
GTTGTGGATT TAAAGAGCTT TTTCTATTAA ATTTCTTAAA ATTAATGTTT TATGTTGCTC
AGAGTAATTT AAACAATTAT GGGCTTAAAG AATTGATCAT TACAGCCCCT GGGATTTAGC
GCTGCAGGCT GATNNCCCTG AAAAACCTCT GATTTATCAG GGNTCGTATT NGGCCGGGCA
AGCCCGGG

MINT11 (SEQ ID NO: 11)
CCCGGGAGTG GCTAACCAGG AANANNAGGC ACTGNCCACA CACCANGGGC TGGGAAATCA
AGTGGCCTGC ACCAAGGCGG CTTCGGGGGA CTTGTCTGTG GCAAGTCTTG GTAGTCCCCA
TTCAAACTTT TGCCTCGAGC GTGTTAAGAA CAACAACAAA AAAAAAATCA AAGTGCCAAA
GGTCCCTCTC TTCTCTCCAG CTCAAGAACC CACCACTTTT CTATGATTTC TTTACAATTT
ATTCCCTCCC TTCCCCCAAT TCCGTTAGTC ACTTTACCCC CACCCCACCC TGGGTTTCTT
TTGTCTGAAT CTTTTTCAAC ACCAAGGTCC CTCTGTATGC CTCTCCCCAA AAGCCCTTAT
GAAAAGTTAC CTGCATTTTT TAAGTGCCTA CATTTCTTAA CTTCGCCTAA CAGCTCTTTG
CCTTAATTAA AGCCTTCTAC CAATTGCTTC TTTTTTCTAA GCTCGCGGGT TTTTTTCAAT
AAGTTTTTTG TTTTTGTTTT TTAAGGGGGG AACAAAAGAA ACGTGATTAC CTTGGAAGGC
GGCTTATTGC AGTTTGGGGG GAAAATTCAC TGCAGCGCTG CGCGACTGGG TTCGGCGTTG
CCCAGGCGGG TCACATAGGA AGCGTGGTGG CCCGGG

MINT12 (SEQ ID NO: 12)
CCCGGGTCCC AGCCCTGAGG ACCAGGTTTC AGGGCTCAGA AGACTCCAGC
GAGGTTCCCT CGCAGATTGT GTCTGCGGTC GTTGGGGGAG GGGCCCCGCA
GCCTTCAGCA GATTATCCAA AGGTCAGTGA CCCAGATATG GTTTTGGNCA
CGGGCCATGT TTCACTTCCT GTGCCCCAAG CAGAATTTAG CTGAATAATT
CGGACCCCAA ACCAAACAAA ACGCTCTTAT TTCCGTTTGG GGATTCTTCG
GAGTTGGGAT TTTTCTGTCT CAAATTAGAA TAATNTGCAT TATTAACCAA
CTTAACATTT ATTTGTTGGT TGGCTGCCTG ACCTTTCTGA GCCTCAGTTT
NTTCGTCTGT AAATTGGGAG CTTACCCAGG TCGGAGGACT GTTGGAATTG
GAAATATTCG AATAAGGAAG TGTTTTTGCA AGTGCTTTGT AAGCAGCAAA
GCGCTTCTTC AGGGGTCAAT TTTTTTTAGC TCTGCAGTCA CCACCCAAAT
TCGGAAGATC GTTGTGCCTT TCTTGGATGA GAATGCCCGG CTCCAGCCCG
GG

MINT13 (SEQ ID NO: 13)

CCCGGGTGTG TGTTGTTCCC CTCCATGTAC CACGTGTTTG TCCTGATGGT CTCCTACCCC
CTGTCCCCCT GAGAGGCCCT GGTGTGTGTT GTTCCCCTCC ATGTATCCAC GTGTTTGTCC 
TGATGGTCTC CTACCCCCCG TCCCCCTGAG AGGCCCTGGT GTGTGTTGTT CCCCTCCATG 
TACCCACGTG TTTGTCCTGA TGCTCTCCTA CCCCCTGTCC CCCTGAGAGG CCCTGGTGTG 
TGTTGTTCCC CTCCATGTAC CCACGTGTTT GTCCTGATGC TCTCCTACCC CCTGTCCCCC 
TGANANGCCC GGG 

MINT14 (SEQ ID NO: 14) 

CCCGGGAGTG GCCCTGCCTG GCCATTTGCT CAAGCAGCAT GCAAGCCGGA TCTACAGCCG
GTGCACCTTG TTCCTTGTTC TTCCGGGCAC CGGAGGCCCA CGTGAAACCT CTCAGAGGAG
AGCGAAGAAG GCCACCTTTC ATCTCAATGA GCCAACAGCC TCACATCCTT AAGTCCTGCC
TAATCTTACG AGTCATGAGT CATCGCTTCT TCCCCGCCAA GTCATTCAGG ATTCAGTGAC 
TCCAGCTGCC CTCAGGATCT GACGGAATCT TGAAGGCAGA GTTCCGAGAC TGCAGTCCAG 
GATAGAGAAG CCCCCGGCTC CATCAGGGGC TCCTCGGCTT CAAGGCAGGA CCCACCCCAA 
AGCTNTCAAA GCGGCAGAGG CCTGTTTTCA GGTCTATTTT TAAAGATNTC TAGGGGAAGT 
GGTTTCTGAA TGCTCTGTAG GATGAGGCTG TGAACAGTGA TTGTTTCATT TGCTCGGGGC 
TGGCAAAAAA GGGATGTACA ACCCGTTCTG ACCACACCAG TGAGTTAAAA AGCACTGCAG 
ACTATGAATA CCTTTTGCCA GTTGGACTGG TTATGGGGAC CGATGGCTTC CCCTACAACT 
GGGGAGGCTG CCAGCCCGGG 



MINT15 (SEQ ID NO: 15) 

CCCGGGACCT CTGCGGTTAC CTGGGCCTTC CCTGCCAGCC CCCACCCCCT GCCCCCACCA 
GAAAGCGTTT ACAGGACAGG TGAGGCCCGC AGGAGGAAAA GCACTCCCCT GGCGCAACAT 
GACTCCAGCG CATCTGCGTC TAAGCCACAC CGTGCTCCTG GTAGATTAAA AATTAATTCT 
AAAAAAAAAA TCTCTCCTAT CCCAAATGCA CTGTTTTCTG CCTTGCTTGA CAATTGATTT 
GTTTTTAAAG GAAAGTTATG GGTAGATCCT CTTTTTTCTT TCCCATTCTT TNNTTCTTCT 
TTTATACTGG AGGGAGGGAA ACGGAGGCGA GGACACACAC GCGCAGGCAG GGGNTGAAAA 
GGCCGAGGTG GGTTTTCCTG TTTAATATCA AAGGAGGGCG AATAATGGGT TTCCTCGGTC 
CGGCTAGGCC GGCCTTTGAC TCAATTGGAA ATGCAAAGGC AGCTTTTGCC TATTNTCTGG 
CTGCTGGCTG AGACCCTAAA TTTCCGTAGG AAATCGTCGG ACACGCACTT AATCGGNCTT 
TGCAANCTTT CCCTCGAAGT TGCACGCGGG TCTGGGCGGA GGAGGCGAGG TAACCCTGGA 
TTCGAACCAG CGCCTTTCTC TCCTTCAGGC CTCCGCCCGG G 

MINT16 (SEQ ID NO: 16) 

CCCGGGACAA GGCGGGTCAC CTCTGGGGCC TCACCGCAGT TCCACTTCCT TTCTCGGGTA
TTTGGAAACC GTCACCCCGC CATTTCGGTG TGGGAAGAGC GCGCGGGCCC TGCCGGACTT 
TAGTGCTTTA GGGGTTAATT TCGGGCTGAC AGGGACGGAG CCTAAGGCAG TGAGCGCCCC 
AGTACCCTCA AACCTTATTG CTGGCCCCTG CTGTCTGAGC TTACAAGCAT TACCGCCGCT 
ATTTCCGTGC GGGCTGACAC GGGAGATGAA AGTGGTGAAG ACACCCAGGG TGCGGGGGTG 
GAGGTGGGGA GAGGAGCCAG ATGGGATTGA TCCCCAGAGC CAGATGGGAT TTAAAGGTGA 
GGGAGGAGGG CATCCTGATG GCGTGTGGTC AGTTGATGCC AGATTGGATG GCTGAGACAC 
CTCTGCAGCT TACAGGAAAG ACAGAGGGAA AGGGTTCTAT GAATTCTAGC TGTTCATACT 
CAAAGCAAAT AATTAATCAA GTGGGGGGGG GGCCTCTAGC TGTAAACCCA TACCTCTAGG 
AAACCTTTTG TCATGTGGAG CCACAGTGCT CACTTGACAG ATTCCCCACT GAGAAGTGGG 
CTAAGAGGTT GGCCTGCATT GCTGGGTGCC TCCAGGTGGG GAGTCCTGTA CCTGGGAGCC 
CGGG 

MINT17 (SEQ ID NO: 17)

CCCGGGCGCC GGGGGTCCCA TCCACTTCTC CCTACCTTCT CTCTCTCCTG TTGGAGAGGC 
CAGGGGCTCC CGCGACTGAA AGGGGCCAGG CTGAGGCTGC CCAATCCCAG GGACAGGAAG 
AACATTTTGG ATGGATCGCG GGGTGCAGAA AAAGGAAGTT GAGCGACAGA CGCNAAGCCC 
AAAGCGTGGA ATTTGGGAAG AGGCAGAAAA AAAGTGGGCG AAGAGGAAGA GAAAAAACNA 
GCNACGAGAT GGGGAGGGAC AAGAAGTTTG GGGAGGATAG AGAAAAAGAG AAACTGCCTG 
GAAAACAAGT TGCAAGACAA TTGAAAAGTT GAATGAGAAG GAAAGANAGG AAGGTNGTNA 
AATCAAATTA GAACTGTTGG AAAAGANGGC TGGGACACGG CTTTCTCTGG TTCTGTCCTC
CCAAGAAATT GGAAACCTCT CCCCTTCCCG GCACCAANCT TNCGGGATGT TCCGGTGCCC
CTCCCCCGGG 

MINT18 (SEQ ID NO: 18) 

CCCGGGAGTG CTCCTGCCGG CAGCATGTCC ACTTGCTAGG GGCAGAGGGG CAGTGGGAGT 
GTCCCCCATG TGCCAGCCTG TCCCCACACT TGGGTTAACC TTCAGTCACC AGGAGGAATA 
AGGTGTGACA CCTCTGAGAC TGTGGGCACA TGTGCTGCTG CCTGACAGTC ACAGGCTGTA 
CTGTGCCAGT GACAGAGGCC ATGATGCCCA CATAACCCAA ACCCGAACCC GAACCCTAAC 
CCCTAACCCC TAACCCTAAC CCTACCCTAA CCCTAACCCA GCCCTAACTC TAGCCCTAGC 
CCTAGCCCTA GCCCTAAGCC CTAAGCCCTT AAGCCTAACC CCAAACCCCC AACCCCAACC 
CCNAACCCTA ACCCTTAACC CTTCCCTCAA GCCTCTCNAA CCCTGCTTGG GTTTACAAGG 
TTATTNAACC CCGGG 

MINT19 (SEQ ID NO: 19) 

CCCGGGATTG GTCTTTTGGC TGGGATGTAA AGGAGGAGGA ACTAGCTGGG GAAAGGCTGG 
GTAAGGGGGA AAACCCAGGG AATTTAACCC CCTCTTCTGT AAAATGAGAA CCCATTCATT 
CACTTAGCAA ATCTTTATGG AGCTCTGTAG GGGCCCTGGG GGAAGGCGAC CAGAGTGTCT 
GAAAGCAAAG AGCAGGAGAG TGTGGACT C A GCTGAGAGGA GAGCAGAGGC TGAGTGTCAG 
GGTGGGGAGT GGGAAGGGGT TTCCAGGTGC CAACAAGGAC ACAAGCCAAG GTGTTGCAGT 
AGGGCATTTG GGGAACGTAT GAGAAGCGCT CCTGCCAGCT TCCCTGGCGG TGTCCTATTC 
CCTGCATCCA TTTTGGACCA TGAGCCCCTT CTCTTACCCT CTGGCCAGGA CCGAATGCCA 
TAGACTTCCC CAGATTGCCC GGG 

MINT20 (SEQ ID NO: 20) 

CCCGGGCCCC AGGGCGGCCC GAACCCCAGC CAAGCCGGCC AGCAGCAGGG CCAACAGAAG 

CAGAAGCGCC ACCGGACGCG CTTCACCCCC GCACAGCTCA ACGAGTTGGA GAGGAGCTTC 

GCCAAGACTC ACTACCCCGA CATCTTTATG CGTGAGGAGC TGGCACTGCG TATCGGGCTG

ACCGAGTCCC GAGTGCAGGT ACGAGGGGCT TGGGATCTGG GACAGAAGGC AAGGACAGGG 

CGGGAGGATT TGGGCAAAGG GAGCAGGGTC TTCCCTTCCC CTGTCGAGAT CCTGGGCTGC 

TTTCAGGCTG CCTGTGCGTT CCTGTATCGA GTTATCTCCA TCTCTACCCG GAAACTGGTC 
CCCATCGCCA TCCCCCAATG GACACGCAAG GCCCGTCTCC GGCCAGTATA GCGACATCCC

GGAAGAAGCT CCTCAAAATC GAAGCCCGGC GTTGTCGGGC TACAGGGCTC GCCTCCTCCG 
CCTGAGAAGG CAACCTCAGC GCCCCCCGGG 

MINT21 (SEQ ID NO: 21) 

CCCGGGAACT ACCTAACGCT AGTTCAGTCC CAAAATGCTG CCCAACGACA GAATGCTCGC 
CTCCTTGCTT CCTCTAACAC TCTGGCACAC CCACTTGGTG TCGGGCCTCT ATGGGCTCGC 
AGTGAAGCCC TGAGCCTGGG CTGCCCCTTC CCATGTGCCC CCTGCCAGCC GGCCCTCCCT 
CCCTTTGGGT GCCCCATCCC TCCAGTCAAC TCCTAGCCGA CCCTTAAGAG TCAGGTATTT 
GTAGCCTTCC CTGACATCCC TCCCAGGCTG TCCCACTGCC AGCAGGACGA GCCTGCCCCT
CCTCCACCCT CCTTACAGCT ATACCTAGCC TTGGCCATAA TCACTAATGG ACCAGGAAAC 
ACCCTGGCGC GCAGAGCCAC CGCAAAGTGG CCCGCTCAGG CCCGCCCGGG 

MINT22 (SEQ ID NO: 22)

CCCGGGGATC TCAGAAAGGG AAGGATGTGG GGAGAGTGAA GGTGGAGGCA GTCACACCTA 

TCCTGGTGAC CTTGGCGTGC CCCCCTGGAG TGACGAGCCA GGGGCTTATA TAGGTGCGAC 

TAGGATGTTG CCTCGTTTTT CACTCGGGGC TGCGAAGAGG GCATTGCTCT TTCTGATGGC 

TGGAAGACAG GGGCAAGGAG AGAGAGAACC GGCCCCGAGA CGGGCTGGAG GGTGGGGACA 

CTGGGGAGTT TGGAGCTGGG GGTTCGGAGT GGGAGGTTTG GGTCTTCTGA GACGCTCCAG 

ACTCTCCGGA GGCGGCAGAG GTCGAGGCAG GAGGCGAATG TGACGCTTAG GGTCGCTACG 

GTTGATGTTG GGCGCCTTTG GAAGCTGGTC ATTAATTCTT GTCATCGGGA GGTTTCGCGG
ANGGCGACAG CGCCCGGG 



MINT23 (SEQ ID NO: 23)

CCCGGGCGCC CGGCCCTGGC TCGCGGAATG GGCGGCCAGA TCTCAGGCCC TGCGTGCCCG
AGCTCAGTCC CAGTTCCAAC CGGGGG1GCC CATGGACTCT CGGAGGGCAC TCCTGGGGGG
ACAGCTAAGA CACCAGGCTG CAGGATCACT CATTGCACGC TGCATAATCG CCGCCACAAA
CTCTCCCGTG CGCAAGAACA AACGCGCGTG GGACAGAAAA AGTTCCTAGG TCTCCGCAGG
AGTGAATGCA AAATCCAGGG GACTCAGGGT CATGTTGGGA GCCCCTTCTC CCCCCGAGAG
TCAGGGAGCT GTTGAGGTGG GATCGGTGAG GGTCGCGCCA CGCGGGTCCC TTCCCTACCA
GGCTCGGATA CCATGCAGCG TGGACACTCC CGAGTTGCTC TGCGGAATCC CGGG



MINT24 (SEQ ID NO: 24)

CCCGGGGACG GGGAGGGAGG AGGGCTGCCG GGATGTGAAC CGGGGAAGGC AGCTGGGGCT 
GGAGAGCAGC GCGGAAAGGG GGCCCAGGGA GCTGGAAAGC GAGCCAAGAG GAGGGCAAGG 
AAGGTGGCGG GCTACGGGGA GGGGAAAGAA AAAGGGTGTC TTGGCGGTGG CCTTGGTAAG 
AGAAAGGGGC AAGGGGTATA ATTGACAAGG CACTGAAAGT ATTGAAGTCA GAGCCTTGGG 
AAGGATCTAC CGAACTCTCG GCGGTCCACG CGGGGACAGA CCTCAGCCCG TGAGCCTTGA 
GCTCCACGCG GGGACAGACC TCAGCCCGTG AGCCTTGAGC TCCACGCGGG GACAGACCTC 
AGCCCGTGAG CCTTGAGCTC CACGCGGGGA CAGACCTCAG CCCGTGAGCC TTGANCCCAG 
AAGGAGTGGC AACCTCANGA CGTTTGCCAA GTGGCCTGGA ATGTTANGGA AACCCCAGCC 
CCGCCAGGAA CANANCTGGC ACTAATTCCC NGCTCGGNCC GGG 

MINT25 (SEQ ID NO: 25) 

CCCGGGGTGG GAGCTGGCTC GTGGGGGCGT GCGCTGCGCG AAAGCGAAAG CCGCCCGCCA

GAGCAACTTT GCGGCGGAAG GCGCCGACGA GGAGCTGTGC CGTGCCGCTC TTGGGGATGG 

TGAGCTGGCC GCCCGGCCGG GTGGGCAGCG CGTCCGGGCG CGGTGCTTCG CTAGCTATAA 

ATAGGTGCTG TGCGGGGACA GGAAGATGGT TCCGGCCCTT TACAAGCACC GGCCCGTTAT 

GTGCGCTGGG CTAGGACCTT GCCCCGCAGC GGAGTGGGAG GAGTGAGGTT AGGGGTAACG 

GTTGCATGGG ATGGGGGGTG GGCACATAGA GCCTACAGCA GAGTTGGCGG CGGGGCTCTC 

CCATGCACTT GGTTGTTTGT CGTTTCTGCT TTTCCCGGG 

MINT26 (SEQ ID NO:26) 

CCCGGGCTCC GGCATAGCTC TAGATTAACG AGCTGGGCGA CGGGGGCGGG GGCAGCATGC 
CCAGCGTCGG TGCACGGCCG GGGTTCTTAG ACATCACAAA CTGTGGAGCG ATACATTGGA
AGCGAAATCC AAGAACGACA CCGGCCGGCG TGTTTCTGAT GTAGTCGTGA TTAGTGTTGG 
AGATGGCCAA GGGCGGCTTG CGGAGCCCAG GGAACGCAGC GAGCCAGGCC CGCGCCCCCC 
TGCAAAGCTC TGCCTTGAAC CACGTAATTC AGGCACCCAG GGTGTCCCTC CCTAGGTCCT 
GGCCACATTT CCCGAAGGTT ATTAAATGGA GAATTCAGCA GTGGAGTTAG AGACGGACGA 
ATGTACAGGA AGATAAGAGG GAAACTCTTC CTCATTTGCT TTAGGGGGTT GTGCTGGGAA 
AATGCGGGGA GGTTAAACAA AGCTTCTCCT ANGACTCCAC GATTCATTTC TAACAACTTC 
TTAAAAGACT CCCGTCTCNG GAGCAGACGC NCCCTCCCCG CTCTCTAAGC CCCGCTGCAT 
GAAANGATGC CCTCGGCCCC TTGCCAGAGG GCCGGGTCCG GGATTCCCGG G

MINT27 (SEQ ID NO: 27)

CCCGGGATCC GGGAAGGCTC CCCCGAGCCG GGGTCGGAAC TGCGGCTGGA GGGGCCTCGG 
CTTGAGGAGG ATCTGGGAGG GCGGGGGCTC AGTCCTGGCC ACCAGGTGTG AGGGGTCGGG 
TGCGGAGCCC TGTGTCAGAC GCGGCGGTGA AGGCTGTAGC CCTGCTCTCC GGGATGGGGG 
TGGTACTCTC ACCGCACCCT GCCCCAAGCC GCAGGGAGCC CCTGGCGCCC CTGGCGCCCG 
GG 

MINT28 (SEQ ID NO: 28) 

CCCGGGCCCC TGGCGGGGAC ACCGGGGGGG CGCCGGGGGC CTCCCACTTA TTCTACACCT 
CTCATGTCTC TTCACCGTGC CAGACTAGAG TCAAGCTCAA CAGGGTCTTC TTTCCCCGCT
GATTCCGCCA AGCCCGTTCC CTTGGCTGTG GTTTCGCTGG ATAGTAGGTA GGGACAGTGG 
GAATCTCGTT CATCCATTCA TGCGCGTCAC TAATTAGATG ACGAGGCATT TGGCTACCTT 
AAGAGAGTCA TAGTTACTCC CGCCGTTTAC CCGCGCTTCA TTGAATTTCT TCACTTTGAC 
ATTCAGAGCA CTGGGCAGAA ATCACATCGC GTCAACACCC GCCGCGGGCC TTCGCGATGC 
TTTGTTTTAA TTAAACAGTC GGATTCCCCT GGTCCGCACC AGTTCTAAGT CGGCTGCTAG 
GCGCCGGCCG ANGCGAAGCG CCGCGCGGAA CCGCGGGCCC GGG 



MINT29 (SEQ ID NO:29) 

CCCGGGAGCT GACCGTGGGG AGGCCGGTTC CGCTGGTTTC AACCAGCCCA CTCTCCCCTC
TTGGGATGCC CAACCCCGCG TACTCACCAT TTCCTGGTTT CCAGGATGTC CTGGCATCTT 
AGCTATGCAT TTCCAGTACC TCCAGACCTC AGGGCAACAA AGGATGTGAC AAAGTCACCT 
AGTGCTCCTG AGGGGCAACA CGCGGGACAG TAGAGATGGA TCTCAGGCTC CGGCCTGAGC 
CAGAGACAAA GGCCGCGCCA AACGCTGGAA GCCACGCCCT CCTCCCCAAC TGCGTGCCTG 
ATAGGACGGT TCCTACTCTG ACAGATTGAA TAAGGCTCCA GGACCCTCGC CCACACCCAC 
CGTCCCCAGC ATTAGTGCGC TTTATGGACA GGGAAACGGG ATCCTGTANG CGGGGTCACA 
CGCCCCGGG 

MINT30 (SEQ ID NO: 30)

CCCGGGCACC TGGGCTGGGG GGGCACTCAC ATGGCTACCG GAGGCCCCCA CGTGCGGCGC 
CCCGCGGAGA CAGGGGTTCG CGTTCAGAGC TGGTGGCGGA TGGACCAGGT GGCCGCGGGG 
ACCAGCTGGG TCCAGATGTG CTGGGCCTGC TGGAAGGGGA CAGGTGCTAC CTGGACGTGT 
AATGGCCCTT GGTCTCTTTT GGCCGAACCT GCCGCTCCGA TCCCCCTCCA TCCCTTCATC 
CCTCCATCCC TCCATCCCTC CATCCCTCCG GCCCTTCTCC CCTTCTTCCT CCGCTGCCTG 
TGTTGCAGAG AGGGGCTGTC AGAGACTGTT GATGTGGGAA AAAATGAAAT GGGGGAGGGG 
TTGTGATTGG CAAAGGCCAG TTGTGCCGGG AGCGGTGGGT AGAGGGGGTG CCCTGAGAGT
GGGAAGCCCT AAACTTGGAG GGCAGCGCTG ATGGGGAGAG GGTTCCTGGC ACCCCCACCT
GCCTTGGAAG TGGGAAATGA CATAGCGGGA GGGGGGCTGC AGTTCCAGCC CCCGGG 

MINT31 (SEQ ID NO: 31) 

CCCGGGGCCT CTATCCTGGC GGGAAGGGCA GGCCGACCCG GCAGACTGCG GCCTCTCGGG 
AGGGAAGAAG GTGTCAGACG CGCGGAGCAA CCATAAATAG CCCCCCTTTC CCAGAAGACG 
GCACGGGGTT CAAGACTCAG GCGCCGCATA CTCAGAATGA GAGCAGAGAC TCCCGCCAGG 
AAAAAAGGGC ACTTAGGGGA TCTGCTCATT AACATGAAAT GCAAATGAGC CCGCCCGGCC 
TCATTTACAC AACTCTGTGC ATGGATTCGG CGAAAGGGCA ACCAGGGAGA CGACGGCGCA 
GCAGCCACTC TGCCACTTCC CCCATCCCCT CCCCCCCATC GGCCGGGGCG GGAACTGAGA 
CGACCCCAAC CCTCTGCGGC GGCGGGAGGT GCGCGGGGGC TGCGTGGGTG GTGCAGCCTT 
AGGGGAGTGA ACAACGCCCA GGGGTGATGG CCTCAGCAAA GTGAGGGGTG GTGATGGAGG 
TCATCCGACC CATCCCGCCG CCTCTCCGCA GTGGCGCAAG CGCCCCAAAA TCTCCGGAGA 
NGGAACTGAG TGACCCACTA GGTTCCGCCG TGTCTACCTC TCGCAGATGT TGGGGAAGTG
CTTCCCGGCG TCTAATCCTC GCTGTTCCCC CCTCCACCGG CGCCCAGCAC ACCCGCGGCG 
CTCCGCTCCC GGG 

MINT32 (SEQ ID NO: 32)

CCCGGGTACC TGCACAGCTC GCTCCCTCCC ATCCTTCGGG TCTTCGCTCG AACGTCCGCT 

CCTCGGTGAG GCCTTCCCTG GACAACGCAT TTGAAACGTA ACCCCAAGGC AAGAAGCCAC 

CTTCCAGGCG CGCAGCCGAA GCCCAGTGCC AAGGAGGCCG GAGACTCGGG TGCCCGCGCA 

TCCCGAAAAC AGCCTCTGAG GGGTCCTCTG AGCATCCTTC CAGCGTGTTT GGGAGGCAAA

CTCGTTGACT AGCTCTTGAG AGGAGTGGCT AGAGGAATCC AGGCGGGGAA GGGGACGGTG 

GACTCCAGGA GAGTGTAATT TACAAAGGCG GGGGGCGGGG ACGCCCAGGT CCGAGTCCCA 

GGACTCTGCG CCGGACGCTT CGCCCGCCCT TTCAGGTCCC CTGCCCGGTC CTCGTACCCG 

CGCGGGTCCG GAGAACCTCT GAGCACCGGC CCCCAGCCCC CGGG



MINT33 (SEQ ID NO: 33)

CCCGGGCAGA AAGGCTCGGA TGGCGGTGGC AGAAAGGCTC GGAGGCGGTG GCCTCAGATC
CCTGGTCCCT CCGGGTCACT GTCGGCTAAT TCTGGGGGAA GGACTGGGCA AGGCTGTTTG
GAAGAAGTCA GCGCCCGGG
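
The block-formatted listings above can be checked mechanically. The following sketch, offered only as an illustration, joins a listing into one string and reports its length, whether it begins and ends with the CCCGGG site, and how many positions are not A, C, G, or T. It uses the MINT27 listing as printed above; the function names are hypothetical, and the expectation that each fragment is bounded by CCCGGG is an assumption drawn from the listings themselves.

```python
SITE = "CCCGGG"  # recognition sequence assumed to bound each fragment

MINT27 = """
CCCGGGATCC GGGAAGGCTC CCCCGAGCCG GGGTCGGAAC TGCGGCTGGA GGGGCCTCGG
CTTGAGGAGG ATCTGGGAGG GCGGGGGCTC AGTCCTGGCC ACCAGGTGTG AGGGGTCGGG
TGCGGAGCCC TGTGTCAGAC GCGGCGGTGA AGGCTGTAGC CCTGCTCTCC GGGATGGGGG
TGGTACTCTC ACCGCACCCT GCCCCAAGCC GCAGGGAGCC CCTGGCGCCC CTGGCGCCCG
GG
"""


def parse_listing(text):
    """Join a block-formatted listing into one uppercase string."""
    return "".join(text.split()).upper()


def check_fragment(seq, site=SITE):
    """Summarize basic properties of one listed fragment."""
    return {
        "length": len(seq),
        "starts_with_site": seq.startswith(site),
        "ends_with_site": seq.endswith(site),
        # count positions the transcription left ambiguous or damaged
        "non_acgt": sum(1 for base in seq if base not in "ACGT"),
    }


if __name__ == "__main__":
    print(check_fragment(parse_listing(MINT27)))
```

A non-zero `non_acgt` count flags positions given as N or damaged in transcription, which may be worth checking against the original figure.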